Readit News
yogrish commented on Yann LeCun to depart Meta and launch AI startup focused on 'world models'   nasdaq.com/articles/metas... · Posted by u/MindBreaker2605
fxtentacle · a month ago
LLMs and Diffusion solve a completely different problem than world models.

If you want to predict future text, you use an LLM. If you want to predict future frames in a video, you go with Diffusion. But what both of them lack is object permanence. If a car isn't visible in the input frame, it won't be visible in the output. But in the real world, there are A LOT of things that are invisible (image) or not mentioned but only implied (text) that still strongly affect the future. Every kid knows that when you roll a marble behind your hand, it'll come out on the other side. But LLMs and Diffusion models routinely fail to predict that, as for them the object disappears when it stops being visible.

Based on what I heard from others, world models are considered the missing ingredient for useful robots and self-driving cars. If that's halfway accurate, it would make sense to pour A LOT of money into world models, because they will unlock high-value products.

yogrish · a month ago
I think world models are the way to go for superintelligence. One of the patents I saw already heading in this direction for autonomous mobility is https://patents.google.com/patent/EP4379577A1, where synthetic data generation (visualization) is the missing step relative to our human intelligence.
yogrish commented on What is a color space?   makingsoftware.com/chapte... · Posted by u/vinhnx
yogrish · 4 months ago
yogrish commented on Things we learned about LLMs in 2024   simonwillison.net/2024/De... · Posted by u/simonw
mvkel · a year ago
I'm surprised at the description that it's "useless" as a programming / design partner. Even if it doesn't produce "elegant" code (whatever that means), it's the difference between an app existing at all or not.

I built and shipped a Swift app to the App Store, currently generating $10,200 in MRR, exclusively using LLMs.

I wouldn't describe myself as a programmer, and didn't plan to ever build an app, mostly because in the attempts I made, I'd get stuck and couldn't google my way out.

LLMs are the great un-stickers. For that reason alone, they are incredibly useful.

yogrish · a year ago
May I know the name of the app you built using LLMs? $10k MRR is a highly successful app.
yogrish commented on A stubborn computer scientist accidentally launched the deep learning boom   arstechnica.com/ai/2024/1... · Posted by u/LorenDB
yogrish · a year ago
Now the fourth element is the seminal paper "Attention Is All You Need," which has taken AI to the next level with OpenAI's LLMs and the like. Another story that fits well here: https://www.ft.com/content/37bb01af-ee46-4483-982f-ef3921436...
yogrish commented on How I ship projects at big tech companies   seangoedecke.com/how-to-s... · Posted by u/gfysfm
yogrish · a year ago
This is a reality in organisations: "it’s paradoxically often better for you if there is some kind of problem that forces a delay, for the same reason that the heroic on-call engineer who hotfixes an incident gets more credit than the careful engineer who prevents one." It's ironic in many areas: a leader (project or political) who runs projects and ships them smoothly is valued less than a leader who creates a mess and then gets it fixed, and the latter is the one who gets celebrated. :(
yogrish commented on AI engineers claim new algorithm reduces AI power consumption by 95%   tomshardware.com/tech-ind... · Posted by u/ferriswil
djoldman · a year ago
https://arxiv.org/abs/2410.00907

ABSTRACT

Large neural networks spend most computation on floating point tensor multiplications. In this work, we find that a floating point multiplier can be approximated by one integer adder with high precision. We propose the linear-complexity multiplication (L-Mul) algorithm that approximates floating point number multiplication with integer addition operations. The new algorithm costs significantly less computation resource than 8-bit floating point multiplication but achieves higher precision. Compared to 8-bit floating point multiplications, the proposed method achieves higher precision but consumes significantly less bit-level computation. Since multiplying floating point numbers requires substantially higher energy compared to integer addition operations, applying the L-Mul operation in tensor processing hardware can potentially reduce 95% energy cost by elementwise floating point tensor multiplications and 80% energy cost of dot products. We calculated the theoretical error expectation of L-Mul, and evaluated the algorithm on a wide range of textual, visual, and symbolic tasks, including natural language understanding, structural reasoning, mathematics, and commonsense question answering. Our numerical analysis experiments agree with the theoretical error estimation, which indicates that L-Mul with 4-bit mantissa achieves comparable precision as float8 e4m3 multiplications, and L-Mul with 3-bit mantissa outperforms float8 e5m2. Evaluation results on popular benchmarks show that directly applying L-Mul to the attention mechanism is almost lossless. We further show that replacing all floating point multiplications with 3-bit mantissa L-Mul in a transformer model achieves equivalent precision as using float8 e4m3 as accumulation precision in both fine-tuning and inference.
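To make the core idea concrete, here is a minimal Python sketch of the general principle: because a float's bit pattern stores its exponent and mantissa side by side, adding two bit patterns (and subtracting the exponent bias) approximates multiplication with a single integer addition. This is a Mitchell-style illustration assuming positive normal float32 inputs, not the paper's exact L-Mul operator; approx_mul, float_to_bits, and bits_to_float are names made up for the sketch.

```python
import struct

def float_to_bits(x: float) -> int:
    # Reinterpret a float32 as its raw 32-bit unsigned integer pattern.
    return struct.unpack("<I", struct.pack("<f", x))[0]

def bits_to_float(b: int) -> float:
    # Reinterpret a 32-bit integer pattern back as a float32.
    return struct.unpack("<f", struct.pack("<I", b & 0xFFFFFFFF))[0]

def approx_mul(a: float, b: float) -> float:
    # Adding the bit patterns adds the exponents exactly and approximates the
    # mantissa product with a mantissa sum; subtracting the float32 exponent
    # bias (0x3F800000) keeps the result in range. Assumes a, b > 0 and normal.
    BIAS = 0x3F800000
    return bits_to_float(float_to_bits(a) + float_to_bits(b) - BIAS)

if __name__ == "__main__":
    for a, b in [(1.5, 2.25), (3.1, 0.47), (7.0, 9.0)]:
        print(f"{a} * {b}: exact={a * b:.4f}, approx={approx_mul(a, b):.4f}")
```

This classic trick alone is fairly coarse; the paper's contribution is a refined variant (L-Mul) whose error is small enough to match float8 multiplication, per the abstract.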

yogrish · a year ago
We used to use fixed-point multiplication (Q format) in DSP algorithms on different DSP architectures: https://en.wikipedia.org/wiki/Q_(number_format). It was very fast and nearly as accurate as floating-point multiplication. Perhaps we should use those DSP blocks as part of tensor cores/GPUs to get both fast multiplication and parallelism.
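For readers unfamiliar with Q formats, here is a minimal Q15 sketch of the kind of fixed-point multiply DSPs do in hardware: a plain integer multiply followed by a shift. The helper names (to_q15, q15_mul, from_q15) are made up for illustration and assume 16-bit Q15 values in [-1, 1).

```python
def to_q15(x: float) -> int:
    # Scale a real value in [-1, 1) to a signed 16-bit Q15 integer.
    return int(round(x * (1 << 15)))

def q15_mul(a: int, b: int) -> int:
    # Integer multiply yields a Q30 result; shifting right by 15 returns to Q15.
    return (a * b) >> 15

def from_q15(q: int) -> float:
    # Convert a Q15 integer back to a real value.
    return q / (1 << 15)

if __name__ == "__main__":
    a, b = to_q15(0.6), to_q15(-0.25)
    print(from_q15(q15_mul(a, b)), "vs exact", 0.6 * -0.25)
```

The whole pipeline is integer arithmetic, which is why it maps so cheaply onto DSP multiply-accumulate blocks; the trade-off is the fixed dynamic range of the chosen Q format.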

u/yogrish

Karma: 1165 · Cake day: August 22, 2011