I think this is not true. The way I learned the BWT, after encoding, you need to store the index of the first character (which is a tiny bit of extra information). https://web.archive.org/web/20170325024404/http://marknelson...
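A naive sketch of the point being made, in Python: the encoder has to return the index of the original string among the sorted rotations alongside the transformed text, or the decoder cannot recover the input. (This is the textbook rotation-sorting construction, not any particular library's API; it also assumes the input has no repeated full rotation, since `rotations.index` would then be ambiguous.)

```python
def bwt_encode(s):
    # Build all rotations of s, sort them; the transform is the last column.
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    transformed = "".join(r[-1] for r in rotations)
    # The "tiny bit of extra information": where the original string landed.
    index = rotations.index(s)
    return transformed, index

def bwt_decode(transformed, index):
    # Repeatedly prepend the transform column and re-sort to rebuild the
    # rotation table, then pick out the row given by the stored index.
    n = len(transformed)
    table = [""] * n
    for _ in range(n):
        table = sorted(transformed[i] + table[i] for i in range(n))
    return table[index]

t, i = bwt_encode("banana")
assert (t, i) == ("nnbaaa", 3)
assert bwt_decode(t, i) == "banana"
```

Without `index`, the last column alone only determines the input up to rotation.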
I would have thought the more obvious approach would be to couple it to some kind of symbolic logic engine. It might transform plain language statements into fragments conforming to a syntax which that engine could then parse deterministically. This is the Platonic ideal of reasoning that the author of the post pooh-poohs, I guess, but it seems to me to be the whole point of reasoning; reasoning is the application of logic in evaluating a proposition. The LLM might be trained to generate elements of the proposition, but it's too random to apply logic.
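A toy sketch of the deterministic half of that coupling, assuming the LLM's (hypothetical) job is to emit facts and rules in a machine-parseable form; the names and encoding here are illustrative, not any real system's:

```python
def forward_chain(facts, rules):
    """Deterministic forward-chaining inference.
    facts: set of atom strings; rules: list of (premises, conclusion)."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in facts and all(p in facts for p in premises):
                facts.add(conclusion)
                changed = True
    return facts

# The LLM would (hypothetically) translate "Socrates is a man; all men
# are mortal" into fragments like these; the engine does the rest.
facts = {"socrates_is_man"}
rules = [(["socrates_is_man"], "socrates_is_mortal")]
assert "socrates_is_mortal" in forward_chain(facts, rules)
```

The division of labor matches the comment: the stochastic model proposes the fragments, the engine applies the logic.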
Let's, for the sake of argument, assume current LLMs are a mirage, but that in the future some new technology emerges that offers true intelligence and true reasoning. At the end of the day such a system will also input text and output text, and the output will probably be piecemeal, as with current LLMs (and humans). So voilà: they are also "stochastic text transformers".
Yes, LLMs were trained to predict the next token. But clearly they are not just a small statistical table or whatever. Rather, it turns out that to be good at predicting the next token, after some point you need a lot of extra capabilities, which is why they emerge during training. "Next-token prediction" is just an abstract name that erases the details of what is going on. A child learning how to write, filling in math lessons, etc. is also doing 'next-token prediction' from this vantage point. It says nothing about what goes on inside the brain of the child, or indeed inside the LLM. It is a confusion between interface and implementation. Behind the interface getNextToken(String prefix) may be hiding a simple table, a 700-billion-parameter neural network, or a human brain with its 100 billion neurons.
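The interface-vs-implementation point, sketched in Python (names are illustrative, loosely mirroring the comment's getNextToken(String prefix); the second class is a stand-in for an opaque model, not a real one):

```python
class TableModel:
    """A literal 'small statistical table': prefix -> next token."""
    def __init__(self, table):
        self._table = table
    def get_next_token(self, prefix):
        return self._table.get(prefix, "<unk>")

class OpaqueModel:
    """Stand-in for a 700-billion-parameter network (or a brain):
    some complicated function of the prefix, invisible to the caller."""
    VOCAB = ["the", "cat", "sat"]
    def get_next_token(self, prefix):
        return self.VOCAB[hash(prefix) % len(self.VOCAB)]

# A caller sees only the shared interface; nothing in the call reveals
# whether a table or something vastly more capable sits behind it.
for model in (TableModel({"the cat": "sat"}), OpaqueModel()):
    token = model.get_next_token("the cat")
```

Judging the implementation from the interface alone is exactly the confusion the comment describes.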
Coroutines in Python are fantastically useful and allow more reliable implementation of networking applications. There is a complexity cost to pay but it's small and resolves other complexity issues with using threads instead, so overall you end up with simpler code that is easier to debug. "Hello world" (e.g., with await sleep(1) to make it non-trivially async) is just a few lines.
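The few-line "hello world" the comment mentions, using the standard asyncio module; the awaited sleep makes it non-trivially async rather than ordinary sequential code:

```python
import asyncio

async def hello():
    await asyncio.sleep(1)  # a real suspension point, per the comment
    return "hello world"

print(asyncio.run(hello()))
```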
But coroutines in C++ are so stupendously complicated that I can't imagine using them in practice. The number of concepts you have to learn to write a "hello world" application is huge. Surely the callback-on-completion style already possible with ASIO (where the callback is usually another method on the same object as the current function) is going to lead to simpler code, even if it's a few lines longer than with coroutines?
Edit: We have a responsibility as senior devs (those of us that are) to ensure that code isn't just something we can write, but something others can read, including those that don't spend their spare time reading about obscure C++ ideas. I can't imagine anyone in good faith arguing that C++ coroutines fall into this category.
As for how we got this far without them, the alternatives were:

1) Using a large number of processes/threads.

2) Raw callback-oriented mechanisms (with all the downsides).

3) Structured async where you pass in lambdas. The benefit is that you preserve the sequential structure and can have proper error handling if you stick to the structure. The downside is that you are effectively duplicating language facilities in the methods (e.g. .then(), .exception()), and stack traces are often unreadable.

4) Raw use of various callback-oriented mechanisms like epoll and such, with the cost in code readability etc., and/or coupled with custom-written strategies to ease readability (so a subset of #3, really).
With C++ coroutines the benefit is that you can write it almost like you usually do (line-by-line, sequentially) even though it runs asynchronously.
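The contrast between option 3 above and coroutine style, sketched in Python's async syntax rather than C++ to keep it short (the boilerplate a faithful C++ version would need is the comment's complaint); `fetch` is a hypothetical stand-in for real async I/O:

```python
import asyncio

async def fetch(x):
    await asyncio.sleep(0)  # stand-in for an asynchronous operation
    return x * 2

# Coroutine style: reads line-by-line sequentially, even though each
# await is a suspension point where other work can run.
async def pipeline():
    a = await fetch(1)
    b = await fetch(a)
    return a + b

# Callback/.then()-style equivalent (option 3): the same logic, but the
# sequential structure is buried in nesting, and errors must be threaded
# through by hand.
def fetch_cb(x, then):
    then(x * 2)

def pipeline_cb(done):
    fetch_cb(1, lambda a: fetch_cb(a, lambda b: done(a + b)))

assert asyncio.run(pipeline()) == 6
```

Both compute the same result; only the coroutine version preserves the shape of ordinary sequential code.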
I can’t be the only one.