Loading model: waiting WebGPU error: WebGPU is NOT supported on this browser.
I'm on macOS, using Safari.
Lowest latency of DDR5-6400 on a typical PC starts at 60 ns or more.
Lowest latency of VRAM on a GeForce RTX 4090 starts around 14 ns.
Lowest latency of Apple M1 memory starts around 5 ns; it behaves more like an L3 cache.
And on Apple M chips, this ultrafast memory is shared by the CPU, GPU, and NPU.
https://www.anandtech.com/show/17024/apple-m1-max-performanc...
https://chipsandcheese.com/p/microbenchmarking-nvidias-rtx-4...
Cool.
Sounds a lot like standing on the shoulders of giants to me. Why would it blow minds that, with newer compute and full hindsight, someone could reproduce something more efficiently?
I feel like I'm missing the point, and a Google search didn't turn up any in-depth article presenting this achievement in novel terms.
Even the best LLMs are still at the level of children in this kind of abstraction, so they can't make "quality" jokes. They also suffer from not having a unique personality, being instead the average of everything. Until this is addressed, don't expect great jokes to come out of AI. It's almost the most challenging discipline; I'd go so far as to say the "joke" is the real Turing test.