justinpombrio (u/justinpombrio)

justinpombrio commented on Show HN: I made a tool to port tweets to Bluesky mantaining their original date bluemigrate.com... · Posted by u/nols05

Kye · 5 months ago

Some people got the idea it has something to do with some song because (not an actual dictionary) Urban Dictionary said so when the actual meaning is "sky tweet."

justinpombrio · 5 months ago

> (not an actual dictionary) Urban Dictionary

<troll> The definition of dictionary is just "Words about words" (source: Urban Dictionary), so I'd say that Urban Dictionary qualifies. </troll>

justinpombrio commented on What can be computed? A practical guide to the theory of computation (2018) [pdf] softouch.on.ca/kb/data/Wh... · Posted by u/nill0

jltsiren · 6 months ago

This is unnecessary nitpicking. You can easily derive a logical contradiction of your choice by assuming that a system that can be formally modeled can prove something that cannot be proven within a formal system. If a specific theorem doesn't prove that, a simple corollary will.

If you want to escape the limits of computability, you have to assume that the physical Church–Turing thesis is false. That the reality allows mechanisms that cannot be modeled formally or simulated on any currently plausible computer.

justinpombrio · 6 months ago

> by assuming that a system that can be formally modeled can prove something that cannot be proven within a formal system

Something that cannot be proven within which formal system? Every true statement can be proven by some formal system. No formal system can prove all true statements.

(This is all about Godel incompleteness. Turing incomputability does apply though, see the sibling comments! It's important to keep them separate, they're saying different things.)

justinpombrio commented on What can be computed? A practical guide to the theory of computation (2018) [pdf] softouch.on.ca/kb/data/Wh... · Posted by u/nill0

Tainnor · 6 months ago

I'm confused by this. Why are you talking about Gödel?

LLMs, like any model of computation, are subject to the limits of Turing Machines, so they cannot for example solve the halting problem.

justinpombrio · 6 months ago

Oof, I pattern matched OP to a different argument I've seen a lot. That's embarrassing.

Yes, that's correct, it's provably impossible to construct an LLM that, if you ask it "Will this program halt? <program>", answers correctly for all programs. Likewise humans, under the assumption that you can accurately simulate physics with a computer. (Note that QM isn't an obstacle, as you can emulate QM just fine with a classical computer with a mere exponential slowdown.)

justinpombrio commented on What can be computed? A practical guide to the theory of computation (2018) [pdf] softouch.on.ca/kb/data/Wh... · Posted by u/nill0

ccvannorman · 6 months ago

What a wonderful text. Easy to read, concise, clear, interesting -- and above all, important.

I would add context for 2025 about the fundamental limits this places on what (modern) AI is in principal capable of. Perhaps some "non-computable" features would need to be hard-coded into AI, so that it could at least better approximate the types of incomputable problems we might ask it to do?

Also, a search of the text for "conscious" does not yield anything, which is probably a good thing. This text also reminds me of the questions like, "What does it mean to be conscious?" and, "How are human brains able to reason about (things like) incomputability, which in some sense, computers as we currently understand them could never do?" and, "What specifically beyond pure mathematics does a brain have or need in order to be conscious enough to reason about such things?"

justinpombrio · 6 months ago

Gödel's Incompleteness Theorem places a limit on what you can prove within a formal system. Neither humans nor LLMs are a formal system, so it says nothing about them.

Someone's going to think that, since you can formally model computer programs (and thus formally model how an LLMs runs), that means that Gödel's Incompleteness Theorem applies to computer programs and to LLMs. But that's irrelevant; being model-able doesn't make something a formal system!

"Formal system" has a very technical meaning: it has a set of consistent axioms, and the ability to enumerate formal proofs using those axioms. For example, Zermelo-Fraenkel set theory is a formal system [1]. It has nine formal axioms, which you can see listed in its Wikipedia article. Utterly different kind of thing than humans or LLMs. Comparing an LLM to ZFC is like comparing a particular watermelon to the number 4.

[1] https://en.wikipedia.org/wiki/Zermelo%E2%80%93Fraenkel_set_t...

justinpombrio commented on Is this the simplest (and most surprising) sorting algorithm ever? (2021) arxiv.org/abs/2110.01111... · Posted by u/gnabgib

teach · 6 months ago

I used to teach this to beginners all the time. I called it "exchange sort", though I'm not sure where I ever got that name from.

Beginners to sorting are often still somewhat struggling with for loops (especially the three-part C kind), struggling with indexing an array, struggling with swaps. It's overwhelming.

This is kind-of like pseudocode for a simple sort -- it has the same rough outline, but no tricky bits. It allows them to get their brains around the basic algorithm, and only after that you trace through an execution on the board to show them that it wastes a lot of time moving things around.

Then you can show them insertion sort as an optimization, and from that move on to fancier ones.

justinpombrio · 6 months ago

Can you give a reference to your teaching materials? The way this sort works is bizarre enough that I'm curious what you said about it. (Also to verify that this is actually the algorithm you taught, given how many other people thought it was something else.)

justinpombrio commented on Is this the simplest (and most surprising) sorting algorithm ever? (2021) arxiv.org/abs/2110.01111... · Posted by u/gnabgib

thaumasiotes · 6 months ago

I expect them to have exactly the same number of swaps in every iteration of the loop past the first one, because they are doing exactly the same thing. It's not just the same number of swaps. It's essentially the same set of swaps, in a different order. In the iteration where i = 1, the number of swaps may differ. Was I wrong about that?

justinpombrio · 6 months ago

Yeah, 2410 vs. 2509 swaps not including the first iteration. You should try measuring these things yourself, I'm not sure if it's doing what you think it's doing.

justinpombrio commented on Is this the simplest (and most surprising) sorting algorithm ever? (2021) arxiv.org/abs/2110.01111... · Posted by u/gnabgib

thaumasiotes · 6 months ago

> The number of swaps is nearly the same between Bubble Sort and Can't Believe Sort. For a random list with 100 elements, it's 2505 swaps vs. 2513 swaps.

Why is there any difference? Does the entire difference come from the iteration where i = 1?

> Bubble Sort always does exactly n(n-1)/2 comparisons

Bubble sort can do less than this. When you're bubbling down into a sorted list, and you find a value lower than you are, you can assume that all values below that one are also lower than you are and terminate the inner loop early.

justinpombrio · 6 months ago

> Why is there any difference? Does the entire difference come from the iteration where i = 1?

Sometimes it's more swaps, and sometimes it's less. The vague pattern looks like Can't Believe Sort does slightly fewer swaps on larger lists, but that could also have been noise.

Why do you expect them to have exactly the same number of swaps?

> Bubble sort can do less than this.

Oh sorry yeah, I implemented bubble sort without any sort of early return, for a more direct comparison.

justinpombrio commented on Is this the simplest (and most surprising) sorting algorithm ever? (2021) arxiv.org/abs/2110.01111... · Posted by u/gnabgib

thaumasiotes · 6 months ago

> In that latter case I'd really appreciate a better explanation for what you think is going on, and why you think this is "just an unoptimised Bubble Sort".

I can try to explain that. The algorithm really is doing a bubble sort, with some unnecessary extra work added. It also does a bit of necessary extra work.

Here is the useful work done by the algorithm:

1. First, preprocess the list so that the maximal element occurs first. (It's a 1-indexed list, so position 1 holds the highest element of the list.)

2. Now, we consider an invariant: the part of the list up to (and including) the maximal element is always sorted. Right now that consists of indices 1-1, and since that range contains only one value it's necessarily sorted. We will use a variable ("i") to remember where the maximal element is.

3. Now, the outer loop: we start at the index immediately after the maximal element, index i+1. Whatever value is in there, we bubble it to the left (this is the inner loop), in the manner of a standard bubble sort: we compare it to index i, and if it's lower (which is always, since i holds the maximal element), we swap. Then we compare index i to index i-1, and so on. Once we find an element that is smaller than the one we've been bubbling to the left, we stop, and our invariant is restored: the list up to the maximal element is still sorted. But the space covered by the invariant has increased by one slot. We increment i.

4. The loop continues until i points to the last slot in the list, at which point we're done. The invariant tells us the the list up to i is sorted, and since i points to the last value in the list, the whole list is sorted.

-----

The algorithm I've described sends i forward from the beginning to the end of the list and j backwards from i+1 to the beginning of the list. The algorithm in the paper does all the same work, but it also does more work, instead having j cross the entire list every time. (It also sends j forwards, like i, but this doesn't matter.†) The only reason for this is to make the algorithm simpler to write down. This is why it's "unoptimized bubble sort". Note that, in the algorithm of the paper, once the maximum value has been swapped into i, the check A[i] < A[j] can never succeed, but we ignore that and keep advancing j instead of just terminating the j loop. This looks especially stupid because the maximal value will always occur before i (except when i = 1, which is the stage that, in my algorithm, is called "preprocessing").

In a bubble sort, if your cursor is at 5, and you want to move A[6] to index 3, you need to make 3 swaps and 3 comparisons. In the paper's sort, when your cursor is at 5 and you want to move A[6] to index 3, you need to make 3 swaps and n comparisons, where n is the length of the full list.

-----

If you're interested in why the j loop is the same thing as a bubble sort, here's an expansion on the example above:

We're in the process of bubble sorting [1 4 8 11 24 5], looking at the 5. Because 5 is more than 4 but less than 8, we want it to ultimately end up at index 3. Everything else will shift up by one index to accommodate this.

In a bubble sort, we swap the 5 down into index 5, then we swap the 5 down into index 4, and then we swap it down into index 3.

    swap(&a6, &a5);
    swap(&a5, &a4);
    swap(&a4, &a3);

In the paper's sort, we do exactly the same thing from the other direction:

    int* temp;
    *temp = a6;      /* we want to put this into its correct position */
    swap(&a3, temp); /* done, but we need to remember what was in a3 so we can swap it up to a4 */
    swap(&a4, temp); /* done, but now we need to remember what was in a4 so we can swap it up to a5 */
    swap(&a5, temp); /* take a guess... */
    swap(&a6, temp);

but there's a slight optimization where instead of using a temporary variable to hold the values we're swapping, we just use the space after the maximal element (which makes `swap(&a6, temp)` redundant - after the optimization, those are the same address).

† It matters a little bit - a bubble sort is free to terminate the inner loop as soon as the value being bubbled has found its correct position. But in the paper's algorithm, we have to find the correct position first, which is done by scanning every index that contains a lower value until we eventually find the first one that doesn't. That work is optimized out of a bubble sort. However, since the paper's sort is committed to scanning the entire list in the inner loop anyway, the fact that the order of bubbling is wrong doesn't require any more extra scanning beyond what they've already committed to.

justinpombrio · 6 months ago

The number of swaps is nearly the same between Bubble Sort and Can't Believe Sort. For a random list with 100 elements, it's 2505 swaps vs. 2513 swaps.

The number of comparisons is of course about twice as large, because Bubble Sort always does exactly n(n-1)/2 comparisons while Can't Believe sort always does exactly n*n comparisons.

https://play.rust-lang.org/?version=stable&mode=debug&editio...

justinpombrio commented on Is this the simplest (and most surprising) sorting algorithm ever? (2021) arxiv.org/abs/2110.01111... · Posted by u/gnabgib

javierlc · 6 months ago

This is well known (google trivial sorting network!), and it would never be published anywhere peer-reviewed.

Good reminder that Arxiv is not peer-reviewed :/

justinpombrio · 6 months ago

I did search for "trivial sorting network", but the only networks that were called trivial were the ones for exactly two elements, while this algorithm sorts an arbitrary number of elements.

Could you link to what you're talking about? And what's its big-O runtime?

justinpombrio commented on After 20 years, math couple solves major group theory problem quantamagazine.org/after-... · Posted by u/isaacfrond

justinpombrio · 6 months ago

The paper: https://arxiv.org/abs/2410.20392