ainch (u/ainch) - Readit News

ainch commented on Yann LeCun raises $1B to build AI that understands the physical world wired.com/story/yann-lecu... · Posted by u/helloplanets

visarga · 3 days ago

> Whenever I see claims about AGI being reachable through large language models, it reminds me of the miasma theory of disease.

Whenever I see people think the model architecture matters much, I think they have a magical view of AI. Progress comes from high quality data, the models are good as they are now. Of course you can still improve the models, but you get much more upside from data, or even better - from interactive environments. The path to AGI is not based on pure thinking, it's based on scaling interaction.

To stay with the miasma theory of disease analogy: if you think architecture is the key, then look at how humans dealt with pandemics... The Black Death in the 14th century killed half of Europe, and no one could think of the germ theory of disease. Think about it - it was as desperate a situation as it gets, and no one had the simple spark to keep hygiene.

The fact is we are also not smart from the brain alone, we are smart from our experience. Interaction and environment are the scaffolds of intelligence, not the model. For example, 1B users do more for an AI company than a better model; they act like human-in-the-loop curators of LLM work.

ainch · 2 days ago

It's unintuitive to me that architecture doesn't matter - deep learning models, for all their impressive capabilities, are still deficient compared to human learners as far as generalisation, online learning, representational simplicity and data efficiency are concerned.

Just because RNNs and Transformers both work with enormous datasets doesn't mean that architecture/algorithm is irrelevant; it just suggests that they share underlying primitives. But those primitives may not be the right ones for 'AGI'.

ainch commented on Yann LeCun raises $1B to build AI that understands the physical world wired.com/story/yann-lecu... · Posted by u/helloplanets

Tenoke · 3 days ago

I think LeCun has been so consistently wrong and boneheaded for basically all of the AI boom, that this is much, much more likely to be bad than good for Europe. Probably one of the worst people to give that much money to that can even raise it in the field.

ainch · 2 days ago

LeCun was stubbornly 'wrong and boneheaded' in the 80s, but turned out to be right. His contention now is that LLMs don't truly understand the physical world - I don't think we know enough yet to say whether he is wrong.

ainch commented on I put my whole life into a single database howisfelix.today/... · Posted by u/lukakopajtic

gardenhedge · 3 days ago

Isn't this a drop in the ocean? Why would any 'normal' person forgo flying? How much CO2 have 'world leaders' emitted going to summits, or Taylor Swift and her fans flying to concerts, or war flights?

ainch · 3 days ago

The general concern around Taylor Swift's emissions has always struck me as shortsighted. Her Eras tour is estimated to have generated around $5bn in economic uplift in the US, at an estimated 10,000 tonnes CO2e for her personal travel. Even if the total footprint is higher, that is thousands of times lower than the emissions intensity of an industry like fast fashion. From an environmental point of view, attending a Taylor Swift show is a much less carbon-intensive way to spend your money than ordering from Temu.

ainch commented on From Noise to Image – interactive guide to diffusion lighthousesoftware.co.uk/... · Posted by u/simedw

adammarples · 12 days ago

If the prompt is the compass, and represents a point in space, why walk there? Why not just go to that point in image space directly, what would be there? When does the random seed matter if you're aiming at the same point anyway, don't you end up there? Does the prompt vector not exist in the image manifold, or is there some local sampling done to pick images which are more represented in the training data?

ainch · 12 days ago

One way of thinking about diffusion is that you're learning a velocity field from unlikely to likely images in the latent space, and that field changes depending on your conditioning prompt. You start from a known starting point (a noise distribution), and then take small steps following the velocity field, eventually ending up at a stable endpoint (which corresponds to the final image). Because your starting point is a random sample from a noise distribution, if you pick a slightly different starting point (seed), you'll end up at a slightly different endpoint.

You can't jump to the endpoint because you don't know where it is - all you can compute is 'from where I am, which direction should my next step be.' This is also why the results for few-step diffusion are so poor - if you take big jumps over the velocity field you're only going in approximately the right direction, so you won't end up at a properly stable point which corresponds to a "likely" image.
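To make the picture above concrete, here is a minimal one-dimensional sketch. It is not a real diffusion model: the `velocity` function is a hand-written toy field that simply points toward a `target` number standing in for the prompt, and the names `sample`, `exact_endpoint`, and `rate` are all made up for illustration. It only shows the mechanics described above: start from seeded noise, follow the field in small Euler steps, and compare coarse steps against fine ones.

```python
import math
import random


def velocity(x, target, rate=3.0):
    """Toy velocity field: points from x toward `target` (a stand-in for the prompt)."""
    return rate * (target - x)


def sample(seed, target, steps, rate=3.0):
    """Draw a noise start point from the seed, then take `steps` Euler steps over t in [0, 1]."""
    x = random.Random(seed).gauss(0.0, 1.0)  # the seed fixes the starting noise
    dt = 1.0 / steps
    for _ in range(steps):
        # All we can compute locally is "which way next from where I am".
        x += dt * velocity(x, target, rate)
    return x


def exact_endpoint(seed, target, rate=3.0):
    """Closed-form endpoint of this toy ODE, used only as a reference for the error."""
    x0 = random.Random(seed).gauss(0.0, 1.0)
    return target + (x0 - target) * math.exp(-rate)


# Same "prompt", different seeds: nearby but different endpoints.
for seed in (0, 1):
    print("seed", seed, "->", sample(seed, target=2.0, steps=200))

# Same seed, fewer steps: big jumps drift away from the true endpoint.
reference = exact_endpoint(0, target=2.0)
for steps in (1, 2, 8, 200):
    error = abs(sample(0, target=2.0, steps=steps) - reference)
    print(steps, "steps, error", error)
```

In this toy the reference endpoint happens to have a closed form, which is exactly what a real model lacks: there, the only way to find the endpoint is to walk the field.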

ainch commented on Julia: Performance Tips docs.julialang.org/en/v1/... · Posted by u/tosh

ziotom78 · 14 days ago

Correct, but I would add: Julia is better than Python+NumPy/SciPy when you need extreme speed in custom logic that can’t be easily vectorized. As Julia is JIT-compiled, if your code calls most of the functions just once it won’t provide a big advantage, as the time spent compiling functions can be significant (e.g., if you use some library heavily based on macros).

To produce plots out of data files, Python and R are probably the best solutions.

ainch · 14 days ago

Even then, if you're familiar with NumPy it's pretty easy to switch to JAX's NumPy API, and then you can easily JIT-compile in Python as well.

ainch commented on Statement from Dario Amodei on our discussions with the Department of War anthropic.com/news/statem... · Posted by u/qwertox

sbinnee · 14 days ago

As a non-US citizen, this article sounds mildly concerning to me. My country is an ally of the US. Good. But I don't know how I would feel when I start seeing Anthropic logos on every weapon we buy from the US.

Aside from my concern, Dario Amodei seems really into politics. I have read a couple of his blog posts and listened to a couple of podcast interviews here and there. Every time, I felt like he sounded more like a politician than an entrepreneur.

I know Anthropic is more mission-driven than, say, OpenAI. And I respect their constitutional way of training and serving Claude models. Claude turned out to be a great success. But reading a manifesto speaking of wars and their missions gives me chills.

ainch · 14 days ago

The most chilling thing imo is that Anthropic is the only lab that has said anything about this. Google and OpenAI presumably signed up to all these terms without any protest.

ainch commented on SynthID: A tool to watermark and identify content generated through AI deepmind.google/models/sy... · Posted by u/tosh

manbash · 14 days ago

It's nice that they explain the "what" (...it is doing) but not the "why". Who is going to use it and for what reasons?

Also, if it's essentially a sort of metadata, can't the output generated image be replicated (e.g. screenshot) and thus stripped of any such data?

ainch · 14 days ago

I've heard of journalists using it to try and figure out whether images sent by sources were generated. In their Nano Banana 2 release blogpost, Google mentioned that SynthID has been used ~20 million times, so there's clearly some interest in identifying AI-generated images.

ainch commented on I don't know how you get here from “predict the next word” grumpy-economist.com/p/re... · Posted by u/qsi

wavemode · 15 days ago

Statistical models generalize. If you train a model on f(x) = 5 and f(x+1) = 6, the number 7 doesn't have to exist in the training data for the model to give you a correct answer for f(x+2).

Similarly, if there are millions of academic papers and thousands of peer reviews in the training data, a review of this exact paper doesn't need to be in there for the LLM to write something convincing. (I say "convincing" rather than "correct" since the author himself admits that he doesn't agree with all the LLM's comments.)

I tend to recommend people learn these things from first principles (e.g. build a small neural network, explore deep learning, build a language model) to gain a better intuition. There's really no "magic" at work here.

ainch · 15 days ago

Sorry but this is famously not true! There is no guarantee that statistical models generalise. In your example, whether or not your model generalises depends entirely on what f(x) you use - depending on the complexity of your function class, f(x+2) could be 7, 8, or -500.

One of the surprises of deep learning is that it can, sometimes, defy prior statistical learning theory to generalise, but this is still poorly understood. Concepts like grokking, double descent, and the implicit bias of gradient descent are driving a lot of new research into the underlying dynamics of deep learning. But I'd say it is pretty ahistoric to claim that this is obvious or trivial - decades of work studied "overfitting" and related problems where statistical models fail to generalise or even interpolate within the support of their training data.
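A quick sketch of the f(x+2) point, under some assumptions of mine: I take x = 0 so the training data is f(0) = 5 and f(1) = 6, and I use a hypothetical one-parameter family f(x) = 5 + x + c*x*(x-1). The extra term is zero at both training points, so every choice of `c` fits the data perfectly, yet the value at x = 2 is 7 + 2c. The three values of `c` are picked only to reproduce the 7, 8, and -500 above.

```python
def make_model(c):
    """Return f(x) = 5 + x + c*x*(x-1); the c-term vanishes at x = 0 and x = 1."""
    def f(x):
        return 5 + x + c * x * (x - 1)
    return f


# Training data, assuming x = 0: f(0) = 5 and f(1) = 6.
train = [(0, 5), (1, 6)]

predictions = {}
for c in (0.0, 0.5, -253.5):
    f = make_model(c)
    # Every model in the family has zero training error...
    assert all(f(x) == y for x, y in train)
    # ...but they disagree about the unseen point x = 2, where f(2) = 7 + 2c.
    predictions[c] = f(2)

print(predictions)  # c = 0.0 gives 7.0, c = 0.5 gives 8.0, c = -253.5 gives -500.0
```

Nothing in the two training points prefers c = 0; that preference has to come from the function class or the training procedure.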

ainch commented on Mercury 2: Fast reasoning LLM powered by diffusion inceptionlabs.ai/blog/int... · Posted by u/fittingopposite

nylonstrung · 16 days ago

I'm not sold on diffusion models.

Other labs like Google have them, but they have simply trailed the Pareto frontier for the vast majority of use cases.

Here's more detail on how price/performance stacks up:

https://artificialanalysis.ai/models/mercury-2

ainch · 16 days ago

This understates the possible headroom as technical challenges are addressed - text diffusion is significantly less developed than autoregression with transformers, and Inception are breaking new ground.

ainch commented on Mercury 2: Fast reasoning LLM powered by diffusion inceptionlabs.ai/blog/int... · Posted by u/fittingopposite

refulgentis · 16 days ago

I'm very worried for both.

Cerebras requires a $3K/year membership to use APIs.

Groq's been dead for about 6 months, even pre-acquisition.

I hope Inception is going well, it's the only real democratic target at this. Gemini 2.5 Flash Lite was promising but it never really went anywhere, even by the standards of a Google preview

ainch · 16 days ago

I don't think it's a good comparison given Inception work on software and Cerebras/Groq work on hardware. If Inception demonstrate that diffusion LLMs work well at scale (at a reasonable price) then we can probably expect all the other frontier labs to copy them quickly, similarly to OpenAI's reasoning models.