Readit News
ainch commented on Yann LeCun raises $1B to build AI that understands the physical world   wired.com/story/yann-lecu... · Posted by u/helloplanets
visarga · 3 days ago
> Whenever I see claims about AGI being reachable through large language models, it reminds me of the miasma theory of disease.

Whenever I see people who think the model architecture matters much, I think they have a magical view of AI. Progress comes from high-quality data; the models are good as they are now. Of course you can still improve the models, but you get much more upside from data, or even better, from interactive environments. The path to AGI is not based on pure thinking; it's based on scaling interaction.

To stay with the miasma theory of disease analogy: if you think architecture is the key, then look at how humans dealt with pandemics. The Black Death in the 14th century killed half of Europe, and no one could conceive of the germ theory of disease. Think about it: it was as desperate a situation as it gets, and no one had the simple spark to maintain hygiene.

The fact is we are also not smart from the brain alone; we are smart from our experience. Interaction and environment are the scaffolds of intelligence, not the model. For example, 1B users do more for an AI company than a better model: they act as human-in-the-loop curators of LLM output.

ainch · 2 days ago
It's unintuitive to me that architecture doesn't matter - deep learning models, for all their impressive capabilities, are still deficient compared to human learners as far as generalisation, online learning, representational simplicity and data efficiency are concerned.

Just because RNNs and Transformers both work with enormous datasets doesn't mean that architecture/algorithm is irrelevant, it just suggests that they share underlying primitives. But those primitives may not be the right ones for 'AGI'.

ainch commented on Yann LeCun raises $1B to build AI that understands the physical world   wired.com/story/yann-lecu... · Posted by u/helloplanets
Tenoke · 3 days ago
I think LeCun has been so consistently wrong and boneheaded for basically the entire AI boom that this is much, much more likely to be bad than good for Europe. Of the people in the field who could even raise that much money, he's probably one of the worst to give it to.
ainch · 2 days ago
LeCun was stubbornly 'wrong and boneheaded' in the 80s, but turned out to be right. His contention now is that LLMs don't truly understand the physical world - I don't think we know enough yet to say whether he is wrong.
ainch commented on I put my whole life into a single database   howisfelix.today/... · Posted by u/lukakopajtic
gardenhedge · 3 days ago
Isn't this a drop in the ocean? Why would any 'normal' person forgo flying? How much CO2 have 'world leaders' emitted flying to summits, or Taylor Swift and her fans flying to concerts, or war flights?
ainch · 3 days ago
The general concern around Taylor Swift's emissions has always struck me as shortsighted. Her Eras tour is estimated to have generated around $5bn in economic uplift in the US, at an estimated 10,000 tonnes CO2e for her personal travel. Even if the total footprint is higher, that is thousands of times lower than the emissions intensity of an industry like fast fashion. From an environmental point of view, attending a Taylor Swift show is a much less carbon-intensive way to spend your money than ordering from Temu.
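As a back-of-envelope check on the figures quoted above (both are estimates, not audited numbers), the implied emissions intensity of the tour works out to about 2 gCO2e per dollar of economic uplift:

```python
# Figures quoted above: ~10,000 tonnes CO2e of personal travel,
# ~$5bn of estimated US economic uplift (both rough estimates).
tour_co2e_tonnes = 10_000
tour_uplift_usd = 5e9

# Convert tonnes to grams and divide by dollars.
intensity_g_per_usd = tour_co2e_tonnes * 1e6 / tour_uplift_usd
print(intensity_g_per_usd)  # 2.0 gCO2e per dollar
```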
ainch commented on From Noise to Image – interactive guide to diffusion   lighthousesoftware.co.uk/... · Posted by u/simedw
adammarples · 12 days ago
If the prompt is the compass, and represents a point in space, why walk there? Why not just go to that point in image space directly, what would be there? When does the random seed matter if you're aiming at the same point anyway, don't you end up there? Does the prompt vector not exist in the image manifold, or is there some local sampling done to pick images which are more represented in the training data?
ainch · 12 days ago
One way of thinking about diffusion is that you're learning a velocity field from unlikely to likely images in the latent space, and that field changes depending on your conditioning prompt. You start from a known starting point (a noise distribution), and then take small steps following the velocity field, eventually ending up at a stable endpoint (which corresponds to the final image). Because your starting point is a random sample from a noise distribution, if you pick a slightly different starting point (seed), you'll end up at a slightly different endpoint.

You can't jump to the endpoint because you don't know where it is - all you can compute is 'from where I am, which direction should my next step be.' This is also why the results for few-step diffusion are so poor - if you take big jumps over the velocity field you're only going in approximately the right direction, so you won't end up at a properly stable point which corresponds to a "likely" image.
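That stepping procedure can be sketched as a plain Euler integration of a velocity field. In a real diffusion or flow model, `velocity_field` is a trained neural network conditioned on the prompt; the toy hand-written field below just pulls samples toward a fixed point, purely for illustration:

```python
import numpy as np

def sample(velocity_field, steps=50, dim=4, seed=0):
    """Euler-integrate a velocity field from a random noise sample.

    velocity_field(x, t) -> estimated direction of the next small step.
    """
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(dim)   # random starting point (the "seed")
    dt = 1.0 / steps
    for i in range(steps):
        t = i * dt
        x = x + dt * velocity_field(x, t)  # small step along the field
    return x

# Toy stand-in field that points toward a fixed "likely image".
target = np.array([1.0, -2.0, 0.5, 3.0])
field = lambda x, t: target - x

print(sample(field, steps=50, seed=0))
print(sample(field, steps=50, seed=1))  # different seed, different endpoint
```

With few steps (large `dt`), each step overshoots the curved field, which is the intuition behind why aggressive step-count reduction degrades quality.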

ainch commented on Julia: Performance Tips   docs.julialang.org/en/v1/... · Posted by u/tosh
ziotom78 · 14 days ago
Correct, but I would add: Julia is better than Python+NumPy/SciPy when you need extreme speed in custom logic that can’t be easily vectorized. As Julia is JIT-compiled, if your code calls most of the functions just once it won’t provide a big advantage, as the time spent compiling functions can be significant (e.g., if you use some library heavily based on macros).

To produce plots out of data files, Python and R are probably the best solutions.

ainch · 14 days ago
Even then, if you're familiar with NumPy it's pretty easy to switch to Jax's NumPy API, and then you can easily jit in Python as well.
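A minimal sketch of what that looks like (`jax.jit` and `jax.numpy` are the real APIs; the function itself is just an illustrative example of unvectorisable-looking elementwise logic):

```python
import jax
import jax.numpy as jnp

# NumPy-style code; the decorator JIT-compiles it with XLA.
@jax.jit
def tanh_gelu(x):
    # the tanh approximation of GELU, written exactly as you would in NumPy
    return 0.5 * x * (1.0 + jnp.tanh(jnp.sqrt(2.0 / jnp.pi) * (x + 0.044715 * x**3)))

x = jnp.linspace(-3.0, 3.0, 5)
print(tanh_gelu(x))  # first call traces and compiles; later calls reuse the cached kernel
```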
ainch commented on Statement from Dario Amodei on our discussions with the Department of War   anthropic.com/news/statem... · Posted by u/qwertox
sbinnee · 14 days ago
As a non-US citizen, this article sounds mildly concerning to me. My country is an ally of the US. Good. But I don't know how I would feel if I started seeing Anthropic logos on every weapon we buy from the US.

Aside from that concern, Dario Amodei seems really into politics. I have read a couple of his blog posts and listened to a couple of podcast interviews here and there. Every time, I felt he sounded more like a politician than an entrepreneur.

I know Anthropic is particularly more mission-driven than, say, OpenAI, and I respect their constitutional approach to training and serving Claude models. Claude turned out to be a great success. But reading a manifesto speaking of wars and their missions gives me chills.

ainch · 14 days ago
The most chilling thing, imo, is that Anthropic is the only lab that has said anything about this. Google and OpenAI presumably signed up to all these terms without any protest.
ainch commented on SynthID: A tool to watermark and identify content generated through AI   deepmind.google/models/sy... · Posted by u/tosh
manbash · 14 days ago
It's nice that they explain the "what" (...it is doing) but not the "why". Who is going to use it and for what reasons?

Also, if it's essentially a sort of metadata, can't the output generated image be replicated (e.g. screenshot) and thus stripped of any such data?

ainch · 14 days ago
I've heard of journalists using it to try and figure out whether images sent by sources were generated. In their Nano Banana 2 release blogpost, Google mentioned that SynthID has been used ~20 million times, so there's clearly some interest in identifying AI-generated images.
ainch commented on I don't know how you get here from “predict the next word”   grumpy-economist.com/p/re... · Posted by u/qsi
wavemode · 15 days ago
Statistical models generalize. If you train a model so that f(x) = 5 and f(x+1) = 6, the number 7 doesn't have to exist in the training data for the model to give you a correct answer for f(x+2)

Similarly, if there are millions of academic papers and thousands of peer reviews in the training data, a review of this exact paper doesn't need to be in there for the LLM to write something convincing. (I say "convincing" rather than "correct" since, the author himself admits that he doesn't agree with all the LLM's comments.)

I tend to recommend people learn these things from first principles (e.g. build a small neural network, explore deep learning, build a language model) to gain a better intuition. There's really no "magic" at work here.

ainch · 15 days ago
Sorry, but this is famously not true! There is no guarantee that statistical models generalise. In your example, whether or not the model generalises depends entirely on which function class f is drawn from: depending on its complexity, f(x+2) could be 7, 8, or -500.

One of the surprises of deep learning is that it can, sometimes, defy prior statistical learning theory to generalise, but this is still poorly understood. Concepts like grokking, double descent, and the implicit bias of gradient descent are driving a lot of new research into the underlying dynamics of deep learning. But I'd say it is pretty ahistoric to claim that this is obvious or trivial - decades of work studied "overfitting" and related problems where statistical models fail to generalise or even interpolate within the support of their training data.
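A concrete version of that point: two functions can fit the same training points exactly and still disagree wildly one step outside them (the constant below is chosen just to hit the -500 from the example above):

```python
# Both functions fit the "training data" f(0) = 5, f(1) = 6 exactly.
def f_linear(x):
    return x + 5  # the intended simple rule: f(2) = 7

def f_wild(x, c=-253.5):
    # the extra term vanishes at x = 0 and x = 1, so the fit is identical
    return x + 5 + c * x * (x - 1)

assert f_linear(0) == f_wild(0) == 5
assert f_linear(1) == f_wild(1) == 6
print(f_linear(2), f_wild(2))  # 7 -500.0
```

Nothing in the two training points distinguishes the functions; only an inductive bias toward the simpler hypothesis gets you 7.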

ainch commented on Mercury 2: Fast reasoning LLM powered by diffusion   inceptionlabs.ai/blog/int... · Posted by u/fittingopposite
nylonstrung · 16 days ago
I'm not sold on diffusion models.

Other labs like Google have them but they have simply trailed the Pareto frontier for the vast majority of use cases

Here's more detail on how price/performance stacks up

https://artificialanalysis.ai/models/mercury-2

ainch · 16 days ago
This understates the possible headroom as technical challenges are addressed - text diffusion is significantly less developed than autoregression with transformers, and Inception are breaking new ground.
ainch commented on Mercury 2: Fast reasoning LLM powered by diffusion   inceptionlabs.ai/blog/int... · Posted by u/fittingopposite
refulgentis · 16 days ago
I'm very worried for both.

Cerebras requires a $3K/year membership to use APIs.

Groq's been dead for about 6 months, even pre-acquisition.

I hope Inception is going well, it's the only real democratic target at this. Gemini 2.5 Flash Lite was promising but it never really went anywhere, even by the standards of a Google preview

ainch · 16 days ago
I don't think it's a good comparison given Inception work on software and Cerebras/Groq work on hardware. If Inception demonstrate that diffusion LLMs work well at scale (at a reasonable price) then we can probably expect all the other frontier labs to copy them quickly, similarly to OpenAI's reasoning models.

u/ainch

Karma: 67 · Cake day: May 4, 2025