fergal_reid commented on Ask HN: How can ChatGPT serve 700M users when I can't run one GPT-4 locally?    · Posted by u/superasn
fergal_reid · 4 months ago
I think the most direct answer is that at scale, inference can be batched, so that processing many queries together in a parallel batch is more efficient than interactively dedicating a single GPU per user (like your home setup).
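A toy sketch of the idea (my own illustration in Python/PyTorch, nothing to do with OpenAI's actual stack): serving users one at a time re-reads the model weights from memory for every request, while a batch shares that one read across everyone.

    import torch

    d_model, n_users = 4096, 64
    W = torch.randn(d_model, d_model)   # stand-in for one weight matrix

    # Interactive: one matmul per user, so the weights get streamed from
    # memory n_users times (the memory-bandwidth-bound regime).
    xs = [torch.randn(1, d_model) for _ in range(n_users)]
    outs_serial = [x @ W for x in xs]

    # Batched: stack the users' activations; the weights are read once and
    # amortized across the whole batch.
    batch = torch.cat(xs, dim=0)        # shape (n_users, d_model)
    outs_batched = batch @ W            # same results, much better throughput

Real serving stacks do this continuously (requests join and leave the batch mid-generation), but the arithmetic advantage is the same.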

If you want a survey of intermediate-level engineering tricks, this post we wrote on the Fin AI blog might be interesting. (There's probably a further level of proprietary techniques OpenAI etc. have beyond these): https://fin.ai/research/think-fast-reasoning-at-3ms-a-token/

fergal_reid commented on Agency vs. Control vs. Reliability in Agent Design   fin.ai/research/agency-co... · Posted by u/destraynor
gota · 8 months ago
This is not core to the article, but -

> However, most examples of high agency agents operate in ideal environments which provide complete knowledge to the agent, and are ‘patient’ to erroneous or flaky interactions. That is, the agent has access to the complete snapshot of its environment at all times, and the environment is forgiving of its mistakes.

This is a long way around basic (very basic) existing concepts in classic AI ("fully" vs. "partially" observable environments, nondeterministic actions), all of which are, if I'm not sorely mistaken, discussed in Russell and Norvig - the standard textbook for AI 101.

Now, maybe the authors _do_ know and chose to discuss this at pre-undergrad level on purpose. Regardless of whether they don't know the basic concepts or think it's better to pretend they don't because their audience doesn't - this signals that folks working on/with the new AI wave (authors or audience) have not read the basic literature of classic AI.

The bastardization of the word 'agent' is related to this, but I'll stop here at the risk of going too far into 'old man yells at cloud' territory, as if I'd never seen technical terms co-opted for hype.

fergal_reid · 8 months ago
Yes, we're familiar with the terminology and framing of GOFAI. FWIW, I read (most of) the 3rd edition of Russell and Norvig in my undergrad days.

However, the point we're trying to make here is at a higher level of abstraction.

Basically most demos of agents you see these days don't prioritize reliability. Even a Copilot use case is quite a bit less demanding than a really frustrated user trying to get a refund or locate a missing order.

I'm not sure putting that in the language of POMDPs is going to improve things for the reader, rather than just make us look more well read.

But your feedback is noted!

fergal_reid commented on Agency vs. Control vs. Reliability in Agent Design   fin.ai/research/agency-co... · Posted by u/destraynor
DebtDeflation · 8 months ago
They use Customer Service as their domain example. This is an area that I've spent the last decade applying AI/NLP to. 99% of tasks in this domain are fully deterministic - get information about a product, place an order, cancel an order, get status on an order, process a return, troubleshoot a product, etc. These are all well-defined processes that should make use of orchestrated workflows that inject AI at the appropriate time (to determine the customer's intent using a classifier, to obtain information for slot-filling using NER, and lately to return information from a knowledge base using RAG). Once we know what the customer wants to do and the information required, we execute the workflow. There's no reason to use an "autonomous agent" to engage in reasoning and planning. Unless we just want to drive up token costs for some reason.
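(As a minimal sketch of the orchestrated-workflow pattern being described here, in Python; classify_intent and extract_slots are hypothetical stand-ins for the classifier and NER components, not a real library:)

    def classify_intent(message: str) -> str:
        # Stand-in for a trained intent classifier.
        return "order_status" if "order" in message.lower() else "faq"

    def extract_slots(message: str) -> dict:
        # Stand-in for NER-based slot filling.
        digits = "".join(ch for ch in message if ch.isdigit())
        return {"order_id": digits or None}

    def handle_message(message: str, order_db: dict) -> str:
        # Deterministic orchestration: AI is injected only at fixed points.
        intent = classify_intent(message)
        slots = extract_slots(message)
        if intent == "order_status":
            order = order_db.get(slots["order_id"])
            return f"Your order is {order['status']}." if order else "I can't find that order."
        return "Here's what our help docs say: ..."  # stand-in for a RAG answer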
fergal_reid · 8 months ago
At Intercom we've also a lot of experience here.

I disagree, basically. In our experience actual real world processes are not compactly defined, and don't have sharp edges.

When you actually go to pull them out of a customer, they have messy probabilistic edges; by leveraging an LLM you can sometimes make progress a lot faster, and end up with a much more compact and manageable representation of the process.

We've a strong opinion this is the future of the space and that purely deterministic workflows will get left behind! I guess we'll see.

fergal_reid commented on 2025 AI Index Report   hai.stanford.edu/ai-index... · Posted by u/INGELRII
namaria · 8 months ago
It's overkill. The models do not capture knowledge about coding. They overfit to the dataset. When one distills data into a useful model, the model can be used to predict future behavior of the system.

That is the premise of LLM-as-AI. By training these models on enough data, knowledge of the world is purported as having been captured, creating something useful that can be leveraged to process new input and get a prediction of the trajectory of the system in some phase space.

But this, I argue, is not the case. The models merely overfit to the training data. Hence the variable results perceived by people. When their intentions and prompt fit the data in the training, the model appears to give good output. But when the situation and prompt do not, the models do not "reason" about it or "infer" anything. It fails. It gives you gibberish or goes in circles, or worse, if there is some "agentic" arrangement, it fails to terminate and burns tokens until you intervene.

It's overkill. And I am pointing out it is overkill. It's not a clever system for creating code for any given situation. It overfits to the training data set. And your response is to claim that my argument is something else - not that it's overkill, but that it can only kill dead things. I never said that. I see it's more than capable of spitting out useful code even if that exact same code is not in the training dataset. But it is just automating the process of going through Google, docs, and Stack Overflow and assembling something for you. You might be good at searching, and lucky, and it is just what you need. Or you might not be used to using the right keywords, or be using some uncommon language, or working in a domain that happens to not be well represented, and then it feels less useful. But instead of just coming up short as search would, the model overkills and wastes your time and god knows how much subsidized energy and compute. Lucky you if you're not burning tokens on some agentic monstrosity.

fergal_reid · 8 months ago
You are correct that variable results could be a symptom of a failure to generalise well beyond the training set.

Such failure could happen if the models were overfit, or for other reasons. I don't think 'overfit', which is pretty well defined, is exactly the word you mean to use here.

However, I respectfully disagree with your claim. I think they are generalising well beyond the training dataset (though not as far beyond as say a good programmer would - at least not yet). I further think they are learning semantically.

Can't prove it in a comment, except to say that there's simply no way they'd be able to successfully manipulate such large pieces of code, using English language instructions, if they weren't great at generalisation and OK at understanding semantics.

fergal_reid commented on 2025 AI Index Report   hai.stanford.edu/ai-index... · Posted by u/INGELRII
simonw · 8 months ago
I've been writing code with LLM assistance for over two years now and I've had plenty of situations where I am 100% confident the thing I am doing has never been done by anyone else before.

I've tried things like searching all of the public code on GitHub for every possible keyword relevant to my problem.

... or I'm writing code against libraries which didn't exist when the models were trained.

The idea that models can only write code if they've seen code that does the exact same thing in the past is uninformed in my opinion.

fergal_reid · 8 months ago
Strongly agree.

This seems to be very hard for people to accept, per the other comments here.

Until recently I was willing to accept an argument that perhaps LLMs had mostly learned the patterns; e.g. to maybe believe 'well there aren't that many really different leetcode questions'.

But with recent models (eg sonnet-3.7-thinking) they are operating well on such large and novel chunks of code that the idea they've seen everything in the training set, or even, like, a close structural match, is becoming ridiculous.

fergal_reid commented on What made the Irish famine so deadly   newyorker.com/magazine/20... · Posted by u/pepys
fergal_reid · 9 months ago
As an Irish person when I saw the article title, I was immediately sceptical.

I personally believe most articles about the famine shy away from the horror of it, and also from a frank discussion.

Going to give some subjective opinion here: people generally downplay the role of the British government and ruling class in it.

Why? One personal theory - growing up in the 80s in Ireland there was a lot of violence in the north. (Most) Irish people who were educated or middle class were worried about basically their kids joining the IRA, and so kind of downplayed the historical beef with the British. That's come through in the culture.

There's also kind of a fight over the historical narrative with the British, maybe including the history establishment, who, yes, care a lot about historical accuracy but also, very subjectively, see the world through a different lens, and often come up through British institutions that view the British empire positively.

It's often easier to say the famine was the blight, rather than political. (They do teach the political angle in schools in Ireland; but I think it's fair to say it's contested or downplayed in the popular understanding, especially in Britain.)

However that article is written by a famous Irish journalist and doesn't shy away from going beyond that.

Perhaps a note of caution - even by Irish standards he'd be left leaning, so would be very politically left by American standards; he's maybe prone to emphasize the angle that the root cause was laissez-faire economic and political policy. (I'm not saying it wasn't.)

I personally would emphasize more the fact that the government did not care much about the Irish people specifically. The Irish were looked down on as a people; and also viewed as troublesome in the empire.

Some government folks did sympathize, of course, and did try to help.

But I personally do not think the famine would have happened in England, no matter how laissez-faire the economic policies of the government. A major dimension must be a lack of care for the Irish people, over whom they were governing; and there are instances of people in power being glad to see the Irish brought low:

"Public works projects achieved little, while Sir Charles Trevelyan, who was in charge of the relief effort, limited government aid on the basis of laissez-faire principles and an evangelical belief that “the judgement of God sent the calamity to teach the Irish a lesson”." per the UK parliament website!

It's not an easy thing to come to terms with even today. I recently recorded a video talking about how fast the build-out of rail infrastructure was in the UK, as an analogy for how fast the AI infra build-out could be; and I got a little queasy realizing that during the Irish potato famine the UK was spending double-digit GDP percentages on rail build-out. Far-sighted, yes, and powering the industrial revolution, but wow, doing that while mass-exporting food from the starving country next door, yikes.

fergal_reid commented on Some thoughts on autoregressive models   wonderfall.dev/autoregres... · Posted by u/Wonderfall
fergal_reid · 9 months ago
Similar arguments to LeCun's.

People are going to keep saying this about autoregressive models, how small errors accumulate and can't be corrected, while we literally watch reasoning models say things like "oh that's not right, let me try a different approach".

To me, this is like people saying "well NAND gates clearly can't sort things so I don't see how a computer could".

Large transformers can clearly learn very complex behavior, and the limits of that are not obvious from their low level building blocks or training paradigms.

fergal_reid commented on Modern-Day Oracles or Bullshit Machines? How to thrive in a ChatGPT world   thebullshitmachines.com... · Posted by u/ctbergstrom
fergal_reid · 10 months ago
I think the authors misunderstand what's actually going on.

I think this is the crux:

>They are vastly more powerful than what you get on an iPhone, but the principle is similar.

This analogy is bad.

It is true that the _training objective_ of LLMs during pretraining might be next token prediction, but that doesn't mean that 'your phone's autocomplete' is a good analogy, because systems can develop far beyond what their training objective might suggest.

Literally humans, optimized to spread their genes, have developed much higher level faculties than you might naively guess from the simplicity of the optimisation objective.

Even if the behavior of top LLMs hasn't convinced you of this: they clearly develop much more powerful internal representations than an autocomplete does, and are much more capable, etc.

I would point to papers like Othello-GPT, or lines of work on mechanistic interpretability by Anthropic and others, as very compelling evidence.

I think that, contrary to the authors, using words like 'understand' and 'think' for these systems is much more helpful than conceptualising them as autocomplete.

The irony is that many people are autocompleting from the training objective to the limits of the system; or from generally being right by calling BS on AI, to concluding it's right to call BS here.

fergal_reid commented on Why is everything based on likelihoods even though likelihoods are so small?   stats.stackexchange.com/q... · Posted by u/cl3misch
blt · 2 years ago
I think the "mass" they are referring to might the mass of the Bayesian posterior in parameter space, not the mass of the data in event space.
fergal_reid · 2 years ago
Yes, in parameter space.

However, TobyTheCamel's point is valid in that there are some parameter spaces where the MLE is going to be much less useful than others.

Even without having to go to high dimensions: if you've got a posterior that looks like a normal distribution, the MLE is going to tell you a lot, whereas if it's a multimodal distribution with a lot of mass scattered around, knowing the MLE is much less informative.
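A quick numerical toy (my own made-up numbers, just to make that concrete): put 20% of the posterior mass in a sharp spike and 80% in a broad lump elsewhere, and the mode tells you very little about where the mass actually is.

    import numpy as np
    from scipy import stats

    theta = np.linspace(-10, 10, 20001)
    dt = theta[1] - theta[0]

    posteriors = {
        "unimodal":   stats.norm(0, 1).pdf(theta),
        # sharp spike holding 20% of the mass + broad lump holding 80%
        "multimodal": 0.2 * stats.norm(0, 0.1).pdf(theta)
                      + 0.8 * stats.norm(4, 2).pdf(theta),
    }

    for name, p in posteriors.items():
        p = p / (p.sum() * dt)          # normalise to a density
        mode = theta[np.argmax(p)]      # the point the MLE/MAP picks out
        near = np.abs(theta - mode) < 1.0
        mass = p[near].sum() * dt
        print(f"{name}: mode at {mode:+.2f}, mass within 1 of mode = {mass:.2f}")

    # unimodal:   mode at +0.00, mass within 1 of mode = 0.68
    # multimodal: mode at +0.00, mass within 1 of mode = 0.25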

But this is a complex topic to address in general, so I'm trying to stick to what I see as the intuition behind the original question!

u/fergal_reid

Karma: 3480 | Cake day: September 13, 2010
About
@gmail.com: fergal.reid

https://www.linkedin.com/in/fergalreid VP of AI @Intercom PhD

Previous user id was 'feral'

Previously cofounder of Synference Predictive Analytics, applied Contextual Bandit / Reinforcement Learning tech, then PM of Optimizely Predictive Analytics team, following acquisition of Synference.

All views my own.

Coauthor of a paper on anonymity in Bitcoin, hence my interest in Bitcoin topics http://arxiv.org/abs/1107.4524
