It is sometimes thought that extremely long-running Turing machine programs must be deeply complicated, or "spaghetti code". This new reigning champion program is a counterexample. All things considered, it is relatively simple.
There are three states: A, B, C. (States are just like goto targets in C.) State B passes control to A and C, but states A and C don't "know about" each other; they only pass control back to B. This is a sort of modular construction, whereas in true spaghetti code each state would be able to pass to all the others.
This program has some other interesting features. It never prints a blank (that is, whenever it scans a 0, it prints 1, 2, or 3). Additionally, every instruction changes either the state or the color -- there are no "lazy instructions" like B1 -> 1LB that just move position without changing anything else.
There is some debate in the bbchallenge project regarding the current long-running champions: do their properties reflect those of the actual longest-running machine of that size, or are their properties just the easiest to automatically search for and prove things about, a kind of streetlamp effect? There's no way to know until the entire search space has been ruled out, either definitively or heuristically. (Every size past BB(5, 2) contains chaotic, pseudorandom machines that are expected to run forever, but can't be proven to do so without big advances in number theory.)
Though I suspect that no long-running machine can be utterly chaotic, at least when you look at the patterns it produces on the tape. To run for a long time, a machine must simulate some sort of higher-level rule: if it were dumping symbols on the tape at random, then it would very soon reach a halting configuration, a cyclical configuration, or some other simplified pattern. Still, one can have long-running machines that do simulate something chaotic on a higher level, spending an inordinate amount of time between each high-level step until it halts.
I agree completely: all of these kinds of conjectures are shaped by what is detectable. If there are any "dark matter" programs out there, they will by definition be difficult to find. That said, I find it entirely plausible that the champions will win by exploiting complex and exotic mathematical facts, while the implementations of the math do not themselves need to be complex or exotic at the code level. More rambling thoughts about this: https://nickdrozd.github.io/2021/09/25/spaghetti-code-conjec...
Handwaving here, but I think the longest-running machines can't follow a specific structure in general.
Rough sketch of an argument:
Let's construct a number by concatenating all intermediate states of the machine. The number of digits of this number should correspond to the runtime (sketchy). We only care about the halting machines, so it's finite. We know that it must be unique, because if a smaller machine computed the same number, we could get a bigger number by simply running the smaller program and then doing other nonsense. That means the biggest programs are Kolmogorov-optimal, and the numbers themselves should be K-trivial and thus nearly but not quite computable. Since they're not computable, the programs themselves can't follow a structure to generate them (since that would in turn be computable for larger values).
At a high level, wouldn’t you expect Champions to be chaotic in the limit? As in, the halting problem tells us that any increasing sequence of Champions is not computable.
       0    1    2    3
  A  1RB  3LB  1RZ  2RA
  B  2LC  3RB  1LC  2RA
  C  3RB  1LB  3LC  2RC
The triple (t', d, s') in row s, column t specifies the transition taken in state s with symbol t under the tape head: the machine overwrites symbol t with t', moves the head Left/Right according to direction d, and changes state from s to s', halting if s' == Z.
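For concreteness, here is a minimal simulator sketch for this encoding. It is my own illustration, not part of the champion's definition: the tape size and step cap are arbitrary toy values, chosen so the head cannot run off the array within the cap.

#include <stdio.h>
#include <string.h>

#define TAPE  2000002               /* toy tape; the real champion outruns any array */
#define LIMIT 1000000ULL            /* with this cap the head stays in bounds */

int main(void)
{
    char prog[] = "1RB3LB1RZ2RA_2LC3RB1LC2RA_3RB1LB3LC2RC";
    const char *rows[3];
    rows[0] = strtok(prog, "_");    /* one 12-character row per state */
    rows[1] = strtok(NULL, "_");
    rows[2] = strtok(NULL, "_");

    static unsigned char tape[TAPE];
    long pos = TAPE / 2;
    int state = 0;                  /* state 0 is A */

    for (unsigned long long steps = 1; steps <= LIMIT; steps++) {
        const char *t = rows[state] + 3 * tape[pos]; /* the (t', d, s') triple */
        tape[pos] = t[0] - '0';                      /* overwrite t with t' */
        pos += (t[1] == 'R') ? 1 : -1;               /* move head in direction d */
        if (t[2] == 'Z') {                           /* halting transition */
            printf("halted after %llu steps\n", steps);
            return 0;
        }
        state = t[2] - 'A';                          /* switch to state s' */
    }
    printf("no halt within the step limit (no surprise for this machine)\n");
    return 0;
}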
This is 3*4*log2(4*2*log2(4+1)) or about 64 bits of information.
Meanwhile, in only 49 bits, BBλ(49) far exceeds Graham's number [1].
Counting the number of distinct TMs is not a simple task. Yours is the broadest way to count (all tables of values where each cell can hold any (symbol, direction, state) combination), but it significantly overcounts the bits needed to describe an arbitrary TM. For the BB(3, 4) case, there are only about 600 billion distinct TMs under Tree Normal Form, aka Brady's algorithm (https://nickdrozd.github.io/2022/01/14/bradys-algorithm.html), which comes out to <40 bits.
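(Quick arithmetic check on that last figure: log2(6 × 10^11) ≈ 39.1, hence the <40 bits.)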
Similarly, 2^n would be a huge overcount of the number of n-bit closed terms, which is given by OEIS sequence A114852 [1]. There are 36140122614 ~ 2^35 closed terms of size 49 bits (length of a straightforward self-delimiting encoding). So in that sense it's still a smaller lambda term computing an incomprehensibly larger output.
Do you know of any (hand-wavy is ok) intuitive explanation for why this machine will halt, beyond the inductive proofs given in your article? Just watching the machine run, I would guess at some kind of infinite behavior. It is remarkable that this is not the case.
Hmm, so I think the "1R" in "1RZ" doesn't matter then for this program and is arbitrarily chosen: since the machine halts there, it doesn't matter what's still written on the tape or where the tape head moves?
(Even the writing of the 1 doesn't matter, though a 0 would have been suboptimal, I guess. A 2 was already written on the tape there; it's being replaced by a 1, but the 2 counted just as much for "number of symbols on the tape".)
I was curious as to how it works, so I implemented it here:
turingmachine.io/?import-gist=c862f28918f3d889f964797694d28fcc
If you run it for a bit you can see what's going on: state B turns 0's into 2's and 2's into 1's, transitioning to C, and state C turns 3's into 2's and transitions to A. So you just iteratively lengthen your run of 3's exponentially, since it takes a full pass through all the 3's to fix a 2 into a 1.
The "states" (A, B, C) correspond to goto targets. The "colors" (0, 1, 2, 3) are runtime data. At each state, the current color is read, and an instruction is executed (print some color, move left or right, goto some state) based on which color is there. Transliterated into C it looks like this:
#include "machine.h"
int main(void)
{
A:
switch (SCAN) {
case 0:
WRITE(1); RIGHT; goto B;
case 1:
WRITE(3); LEFT; goto B;
case 2:
WRITE(1); RIGHT; goto H;
case 3:
WRITE(2); RIGHT; goto A;
}
B:
switch (SCAN) {
case 0:
WRITE(2); LEFT; goto C;
case 1:
WRITE(3); RIGHT; goto B;
case 2:
WRITE(1); LEFT; goto C;
case 3:
WRITE(2); RIGHT; goto A;
}
C:
switch (SCAN) {
case 0:
WRITE(3); RIGHT; goto B;
case 1:
WRITE(1); LEFT; goto B;
case 2:
WRITE(3); LEFT; goto C;
case 3:
WRITE(2); RIGHT; goto C;
}
H:
HALT;
}
Question: are there any opportunities to rewrite this logic in a more "structured" style, or to make any other optimizations?
I love these puzzles. GNU C supports labels as values for computed goto. This is useful for direct-threaded dispatch. You trade off a branch instruction for an address lookup, but it makes the code more structured.
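A sketch of what direct-threaded dispatch could look like for this machine (hypothetical: it relies on the GCC/Clang labels-as-values extension, and the toy tape size and step cap are my additions so it actually terminates):

#include <stdio.h>

#define TAPE 2000002  /* toy tape; the real champion outruns any array */

int main(void)
{
    static unsigned char tape[TAPE];
    long pos = TAPE / 2;
    unsigned long long steps = 0;

    /* dispatch[state][symbol] replaces the switch tree:
       each step is a single indirect jump through this table. */
    static void *dispatch[3][4] = {
        { &&A0, &&A1, &&A2, &&A3 },
        { &&B0, &&B1, &&B2, &&B3 },
        { &&C0, &&C1, &&C2, &&C3 },
    };

#define STEP(s) do { if (++steps > 1000000) goto H; goto *dispatch[s][tape[pos]]; } while (0)

    STEP(0);                                  /* start in state A */
A0: tape[pos] = 1; pos++; STEP(1);
A1: tape[pos] = 3; pos--; STEP(1);
A2: tape[pos] = 1; pos++; goto H;             /* 1RZ: halt */
A3: tape[pos] = 2; pos++; STEP(0);
B0: tape[pos] = 2; pos--; STEP(2);
B1: tape[pos] = 3; pos++; STEP(1);
B2: tape[pos] = 1; pos--; STEP(2);
B3: tape[pos] = 2; pos++; STEP(0);
C0: tape[pos] = 3; pos++; STEP(1);
C1: tape[pos] = 1; pos--; STEP(1);
C2: tape[pos] = 3; pos--; STEP(2);
C3: tape[pos] = 2; pos++; STEP(2);
H:  printf("stopped after %llu steps\n", steps);
    return 0;
}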
Each row is a state and each column is the symbol that was just read off the tape. So the first row & first column mean "I have just read symbol 0 and am currently in state A".
The cell in the table describes which actions to perform. The first row & first column has "1RB" which means: "replace the symbol on the tape with '1', shift 1 symbol to the right on the tape and switch to state 'B'".
There's a brief explanation at [0]. (Note that "1RZ" is understood to be a halting transition, since state "Z" has no rules.) Wikipedia also has a more elaborated example of a TM state table [1]. You can see the trace of this particular TM at [2].
Each triplet seems to be the symbol to write, the direction to move in, and the next state. You can think of the behavior of a Turing machine as a function of the current state and the symbol just read, applied iteratively.
Why not? The idea that the only valid way to publish a scientific result is in a (so-called) peer-reviewed journal is a relic from 200 years ago when the scientific community was small enough to fit within the monkeysphere [1]. It is being clung to only because it is profitable for a small group of powerful academics and publishing houses, not because it has any actual merit in terms of the advancement of science. In fact, it's probably in no small measure responsible for contemporary replication crises. I'm as staunch an advocate of the scientific method as you could hope to find (I'm writing a book about it!) but IMHO traditional peer review is long past its sell-by date.
Because Discord is a proprietary glorified IRC SaaS; its contents are, by nature, ephemeral and under control of the vendor. I'd expect such links to rot very quickly.
Collaborating on Discord is fine. Important results, including the citations backing them, should really be published, or at least replicated, in a more durable medium that's less susceptible to link rot and easier to archive. Today, that's PDFs in paper repositories, or even regular blog posts.
To be fair, the responsibility for the replication crises is much trickier. It’s based on human motivations and poor science:
1. Outright fraud
2. Lack of independent verification of results before building up a body of work resulting in a bunch of garbage. Contributes to making #1 “easy”.
3. Financial pressures to publish or perish, contributing to 1 & 2. If peer review didn't exist you'd still see something similar around producing "recognized" results. This is also probably why we haven't made any major breakthroughs in particle physics, which now has a much longer-term thinking phase to come up with anything.
The biggest problem with peer review is actually the establishment of an orthodoxy that discourages being upended, by controlling the career success of anyone who dares challenge it: you have to support the orthodoxy to get career success, and you fail to get recognition if your idea challenges the peer reviewers' ideas and careers. That said, such pressures existed before, but at least you could try to build your own following, and it was competing orthodoxies instead of a single one winning out.
FWIW, it's a public Discord server, you can find the invite link at the top-right of https://bbchallenge.org.
Also, I'd consider these more as attributions than citations. All the major arguments supporting the results have been replicated in the blog post (in a more rigorous form), so it can stand on its own: the Discord links just provide historical context for those interested.
As a member of these chats: it's often like hitting on an idea on a break-room blackboard and working it out, except the interaction can be cited. That's a positive change, if we can follow through and add to the literature in due time. Here's hoping.
That's fine, but the citation shouldn't be in the form of a Discord link, or at least not exclusively in that form. Make a copy of the relevant messages, host them elsewhere, and include that copy in the citation. Discord has been pretty good so far but being a durable archive has never been their core selling point so I don't trust that to continue indefinitely.
I understand the complaint here, but a lot of recent impressive progress in mathematics has been by rapid collaboration + iteration (e.g. the project to improve Zhang's prime gap bound), and it may be the case that different communication tools are not easily fungible with Discord in this regard. You have to go where the people actually are.
But Discord isn't citable. Somebody needs to archive the Discord and make that available through the proper channels (e.g. a website, a book, the Internet Archive).
Finding bigger busy beaver numbers is not exactly foundational. More like recreational. If it were foundational it would be peer reviewed in a journal article, not posted on a blog.
> If it were foundational it would be peer reviewed in a journal article, not posted on a blog.
What I think you are doing here is to DEFINE "foundational work" as something that gets published in a journal.
I don't mind if you use that definition, but if you do then the fact that all foundational work is published in journals is not insightful or informative, it is merely the definition.
If, on the other hand, you intended for the statement to say something meaningful about how important scientific and mathematical work is communicated, then you need a different definition of "foundational". And you would have to look to that definition to decide whether this work was foundational, because if it was then it disproves your hypothesis: it would be a counter example that illustrates that some foundational work is shared in places other than journals.
There are only so many Turing machines that we can describe with a not-too-large number of symbols, such as 1RB3LB1RZ2RA_2LC3RB1LC2RA_3RB1LB3LC2RC. The fact that some of these can take such a crazily large number of steps before halting is mind-blowing to me.
There are 2^60 such 3 state 4 symbol Turing machines. A 49-bit lambda term whose output (normal form) exceeds Graham's Number should blow your mind even more.
I don’t know why, but these probably-useless-results (which frankly I don’t 100% understand) intrigue me more than the incredibly-useful LLM advances. I guess because I’m naturally drawn to “simple” mathematical truths more than “messy” engineering results.
An n-state s-symbol TM can transition to at most s other states (or halt). Thus, for s = 4 or s = 2, only the smallest of TMs could be spaghetti-like.
[1] https://oeis.org/A333479
[1] https://oeis.org/A114852
I calculate 3*4*log2(4*2*log2(4+1)) ≈ 51.
I (a layperson) would have thought the calculation would be 3*4*log2(4*2*4) = 60.
Is it perhaps 3*4*log2(4*2*log2(3*3*4-1)) ≈ 64?
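For reference, the straightforward count needs no nested logs (a sketch, counting halt as a fourth next-state):

    choices per cell = 4 symbols * 2 directions * (3 states + halt) = 32 = 2^5
    cells = 3 states * 4 symbols = 12
    machines = 32^12 = 2^60, i.e. 60 bits

which agrees with the 2^60 figure quoted elsewhere in the thread.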
> So you just iteratively lengthen your run of 3's exponentially
But the really tricky part to understand is why it eventually stops, after an unimaginably large number of steps.
Here's a tangent to explore:
A BitGrid[1] (my personal hobby horse) only has 4 bits of state per cell, so a 4x4 grid of cells can't count more than 2^64, no matter what. Finding the highest it could count would be interesting. For small grids, the edge connections would dominate the outcome.
[1] https://esolangs.org/wiki/Bitgrid
[2] https://github.com/mikewarot/Bitgrid
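(Sanity check on that bound: 4 × 4 cells × 4 bits = 64 bits of state, so at most 2^64 distinct configurations.)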
Because A and C only jump to B, it is possible to structure this using only loops and one boolean. Let us use Rust to demonstrate, as it lacks goto:
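Here is a sketch of what that might look like. The Tape scaffolding, toy tape size, and step cap are my own assumptions, added just to make it self-contained and terminating:

// Structured rewrite of the champion: one outer loop, one boolean.
struct Tape {
    cells: Vec<u8>,
    pos: usize,
    steps: u64,
}

impl Tape {
    fn scan(&self) -> u8 { self.cells[self.pos] }
    fn write(&mut self, c: u8) { self.cells[self.pos] = c; self.steps += 1; }
    fn left(&mut self) { self.pos -= 1; }
    fn right(&mut self) { self.pos += 1; }
    fn done(&self) -> bool { self.steps > 1_000_000 } // give up long before the real halt
}

fn main() {
    let mut t = Tape { cells: vec![0; 2_000_002], pos: 1_000_001, steps: 0 };
    // The single boolean: does B hand control to C next, or to A?
    let mut in_c = false;
    'b: loop {
        if t.done() { break; }
        if !in_c {
            // State A: loops to itself on 3, exits to B on 0/1, halts on 2.
            loop {
                match t.scan() {
                    0 => { t.write(1); t.right(); break; }
                    1 => { t.write(3); t.left(); break; }
                    2 => { t.write(1); t.right();           // the 1RZ halt
                           println!("halted after {} steps", t.steps);
                           return; }
                    _ => { t.write(2); t.right(); }         // 3 -> 2, stay in A
                }
            }
        } else {
            // State C: loops to itself on 2/3, exits to B on 0/1.
            loop {
                match t.scan() {
                    0 => { t.write(3); t.right(); break; }
                    1 => { t.write(1); t.left(); break; }
                    2 => { t.write(3); t.left(); }
                    _ => { t.write(2); t.right(); }
                }
            }
        }
        // State B: loops to itself on 1, then picks the next branch.
        loop {
            match t.scan() {
                0 => { t.write(2); t.left(); in_c = true; continue 'b; }
                1 => { t.write(3); t.right(); }
                2 => { t.write(1); t.left(); in_c = true; continue 'b; }
                _ => { t.write(2); t.right(); in_c = false; continue 'b; }
            }
        }
    }
    println!("gave up after {} steps", t.steps);
}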
Of course it is possible to rewrite this as a single loop if you are willing to accept two bits of extra state rather than one.
[0] https://bbchallenge.org/story#turing-machines
[1] https://en.wikipedia.org/wiki/Turing_machine#Formal_definiti...
[2] https://bbchallenge.org/1RB3LB1RZ2RA_2LC3RB1LC2RA_3RB1LB3LC2...
Despite a CS undergrad I don’t recall really learning any of these canonical representations of TMs before.
[1] https://en.wikipedia.org/wiki/Dunbar%27s_number