I don't think this is true, in the general case: Rust has shown that languages can be safe in ways that improve runtime performance.
In particular, languages like Rust let programmers express stronger compile-time constraints on runtime behavior, so the compiler can safely omit bounds checks and other checks that an ordinary C program would need for safety. Similarly, Rust's prohibition of mutable aliasing opens up entire classes of optimizations that are extremely difficult to perform on C programs (to the extent that Rust regularly exposes bugs in LLVM's alias analysis, which C/C++ inputs had never exercised that hard).
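A minimal sketch of both points (illustrative function names, not from any particular codebase): iterating a slice with its iterator lets the compiler prove every access is in bounds, and an exclusive `&mut` reference is guaranteed not to alias its inputs, which is information a C compiler only gets from `restrict` annotations it can rarely verify.

```rust
// Bounds-check elision: the slice iterator carries the length
// invariant, so no per-element bounds check is emitted. The
// index-based C equivalent needs a check (or is unsafe).
fn sum(xs: &[u32]) -> u32 {
    xs.iter().sum()
}

// No mutable aliasing: `acc` is an exclusive reference, so it cannot
// point into `xs`. The compiler may keep `*acc` in a register across
// the loop; in C, `int *acc` could alias the array and force a
// reload on every iteration.
fn accumulate(xs: &[u32], acc: &mut u32) {
    for &x in xs {
        *acc += x;
    }
}

fn main() {
    let xs = [1, 2, 3, 4];
    assert_eq!(sum(&xs), 10);
    let mut total = 0;
    accumulate(&xs, &mut total);
    assert_eq!(total, 10);
}
```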
Edit: Other examples include ergonomic static dispatch (Rust makes things like `foo: impl Trait` look dynamic, but they're really static under the hood) and the entire notion of a "zero-cost abstraction" (a Rust abstraction compiles down to code no worse than its hand-written "as if" equivalent, so the programmer can't easily build a slower implementation by using the abstraction).
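To illustrate the static-dispatch point (toy trait and names, purely for the example): `impl Trait` in argument position reads like it takes "some dynamic thing", but it desugars to a generic parameter and is monomorphized into a separate, statically dispatched function per concrete type, with no vtable call.

```rust
trait Shout {
    fn shout(&self) -> String;
}

struct Dog;

impl Shout for Dog {
    fn shout(&self) -> String {
        "woof".to_string()
    }
}

// Looks dynamic, but desugars to `fn loud<T: Shout>(x: T)`:
// the compiler emits one specialized copy per concrete type,
// and the call to `shout` is resolved at compile time.
fn loud(x: impl Shout) -> String {
    x.shout().to_uppercase()
}

fn main() {
    assert_eq!(loud(Dog), "WOOF");
}
```

Dynamic dispatch is still available when you want it, but you have to ask for it explicitly with `dyn Trait`.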
CoT improves results, sure. And part of that is probably because you are telling the LLM to add more things to the context window, which increases the chance of resolving some syllogism latent in the training data: one inference cycle tells you that "man" has something to do with "mortal" and "Socrates" has something to do with "man", but two cycles emit both of those into the context window and let you get statistically closer to "Socrates" having something to do with "mortal". But given that the training/RLHF for CoT revolves around generating long chains of human-readable "steps", those steps can't really be explanatory of a process that is essentially statistical.
This is false: reasoning models are rewarded or penalized based on their performance at verifiable tasks, not on human feedback or next-token prediction.