Readit News
rryan commented on The bitter lesson is coming for tokenization   lucalp.dev/bitter-lesson-... · Posted by u/todsacerdoti
rryan · 2 months ago
Don't make me tap the sign: There is no such thing as "bytes". There are only encodings. UTF-8 is the encoding most people are using when they talk about modeling "raw bytes" of text. UTF-8 is just a shitty (biased) human-designed tokenizer of the unicode codepoints.
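To make that concrete, a minimal Python sketch of how UTF-8 spends a variable number of bytes per Unicode codepoint, biased toward ASCII:

```python
# UTF-8 assigns 1-4 bytes per Unicode codepoint: ASCII gets 1 byte,
# Latin-with-accents 2, most CJK 3, and astral-plane codepoints 4.
# This built-in bias is what makes "raw bytes" of text a human-designed
# tokenization of codepoints rather than a neutral representation.
for ch in ["a", "é", "あ", "𝄞"]:
    encoded = ch.encode("utf-8")
    print(ch, len(encoded), list(encoded))
```

So a byte-level model sees one symbol per English letter but three or four per character in many other scripts.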
rryan commented on Transformers Without Normalization   jiachenzhu.github.io/DyT/... · Posted by u/hellollm
kouteiheika · 6 months ago
If true, this is a very nice incremental improvement. It looks like it doesn't meaningfully improve the capabilities of the model, but it is cheaper to compute than RMSNorm (which essentially all current state-of-the-art LLMs use), which means faster/cheaper training.
rryan · 6 months ago
RMSNorm is pretty insignificant in terms of the overall compute in a transformer though -- usually the reduction work can be fused with earlier or later operations.
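For scale, RMSNorm is just one mean-of-squares reduction and a rescale; a minimal NumPy sketch (toy hidden size and gain, assumed for illustration):

```python
import numpy as np

def rms_norm(x, g, eps=1e-6):
    # One mean-of-squares reduction over the hidden dim plus a rescale:
    # O(d) work, versus O(d^2) for the matmuls it sits between, which is
    # why the reduction is cheap and easy to fuse into adjacent ops.
    rms = np.sqrt(np.mean(x * x, axis=-1, keepdims=True) + eps)
    return x / rms * g

rng = np.random.default_rng(0)
x = rng.standard_normal(8)      # toy hidden vector, d=8
out = rms_norm(x, np.ones(8))   # learned gain g, initialized to ones
```

The output has unit RMS (up to eps), and the per-token cost is linear in the hidden dimension.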
rryan commented on Can Gemini 1.5 read all the Harry Potter books at once?   twitter.com/deedydas/stat... · Posted by u/petulla
rryan · a year ago
ML 101: Do not evaluate on the training data.

Yes of course it can, because they fit in the context window. But this is an awful test of the model's capabilities because it was certainly trained on these books and websites talking about the books and the HP universe.

rryan commented on (next Rich)   clojure.org/news/2023/08/... · Posted by u/poidos
rryan · 2 years ago
Thanks for everything, Rich. You inspired me repeatedly.
rryan commented on Google doesn’t want employees working remotely anymore   theverge.com/2023/6/7/237... · Posted by u/dlb007
yieldcrv · 2 years ago
When I lived in Chelsea, Google contacted me and insisted I fly out to San Francisco for a curated tour of Mountain View

“Where leadership roles had to be”

I said I wanted to walk to work at the giant billion-dollar office down the street, I love Chelsea, I love the Meatpacking District, I love the Highline and the things around that office, I love models

But “roles with direct reports had to be in mountain view” and they assured me I would be so impressed with the highly coveted Mountain View and highly coveted Google

the only thing seared in my brain from that trip was standing at an elevator that had a warning sign that I might get cancer if I used it, in the middle of a sprawling, boring, unwalkable suburb, with a janitor being my best source at the time that it's a boilerplate disclaimer. He was right. But that was my experience.

rryan · 2 years ago
Yea uh, don't know what was going on there, but there are roles with direct reports that aren't in Mountain View (unless you mean, like, in 2004). NYC has thousands of Googlers.
rryan commented on AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head   arxiv.org/abs/2304.12995... · Posted by u/iamflimflam1
rryan · 2 years ago
This is ... not what I expected. It's basically wiring up pre-trained models to ChatGPT via a router and "modality transformations" (a.k.a. speech-to-text and text-to-speech).

I expected it to be a GPT-style model that processes audio directly to perform a ton of speech and maybe speech-text tasks in a zero-shot manner.

rryan commented on $6.96B was raised by private longevity companies in 2022   spannr.com/reports/2022-l... · Posted by u/deeel
philipkglass · 3 years ago
It means that there are all sorts of barriers in place to developing drugs or therapies that prevent or reverse aging -- because you couldn't get FDA approval for something that it doesn't consider a disease.

I don't see how that is a significant barrier to an anti-aging drug that actually works. Pick any one of the many recognized medical conditions that is strongly correlated with old age, like osteoporosis. Prove that your anti-aging drug effectively and safely treats osteoporosis in the elderly. The FDA will approve it. If your osteoporosis treatment also cures wrinkles and gray hair as a side effect, the FDA won't object. And once the drug is approved for one condition it can be prescribed off-label for other conditions. Everyone will quickly learn what it's useful for, just like how people started using semaglutide for weight loss when it was "officially" still just a diabetes treatment.

rryan · 3 years ago
Except when your doctor won't prescribe it to you off-label because you don't have a condition they were taught about in med school
rryan commented on Ask HN: Google spam filter getting worse?    · Posted by u/jgwil2
rryan · 3 years ago
In the past few months I've been getting much more fake order / payment spam.
rryan commented on Neurons in a dish learn to play Pong   nature.com/articles/d4158... · Posted by u/rogerian
lern_too_spel · 3 years ago
Not much. A transformer trained on multiple senses can learn the sound that an animal makes and associate it with seeing that animal. It can also learn how another agent reacts after it says a word.

The huge difference is actually between animal reflexes and learned behavior. Reflex is built-in. I didn't learn to kick my leg in response to a tap on the patellar tendon.

rryan · 3 years ago
I agree that a Transformer is an example of a "reflexive" behavior because it learns to react in a context (via gradient descent rather than evolution as the learning algorithm). It's a conditional categorical distribution on steroids.

I also agree it's not much different than what's going on in this petri dish with pong.

But I don't think that's a profound statement.

What I'm saying is that calling what a Transformer does "language development" isn't accurate. A Transformer can't "develop" language in that sense, it can only learn "reflexive" behavior from the data distribution it's trained on (it could never have produced that data distribution itself without the data existing in the first place).
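The "conditional categorical distribution" framing above can be sketched in a few lines (logits here are hypothetical toy values; a real model computes them from the token history):

```python
import numpy as np

# Next-token generation as a conditional categorical distribution:
# the context determines a vector of logits, softmax turns them into
# a categorical distribution over the vocabulary, and we sample.
def sample_next_token(logits, rng):
    probs = np.exp(logits - logits.max())   # numerically stable softmax
    probs /= probs.sum()
    return int(rng.choice(len(probs), p=probs))

rng = np.random.default_rng(0)
logits = np.array([2.0, 0.5, -1.0])         # toy 3-token vocabulary
token = sample_next_token(logits, rng)
```

Everything "learned" lives in how the logits are computed from the context; the sampling step itself is just a categorical draw.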

u/rryan
Karma: 2813 · Cake day: March 14, 2009