anthuswilliams (u/anthuswilliams)

anthuswilliams commented on Experts Have World Models. LLMs Have Word Models latent.space/p/adversaria... · Posted by u/aaronng91

skeptic_ai · 2 days ago

I don’t get from your message why an LLM can’t do it.

Related: Have you seen Nvidia with their simulated 3D environments? That might not be called an LLM, but it’s not very far from what our LLMs actually do right now. It’s just a naming difference.

anthuswilliams · 17 hours ago

This argument was specifically about LLMs, not about other techniques (RL, multi-armed bandit, etc) that might be better leveraged to accomplish this type of goal.

An LLM which makes a tool call to a function called `ride_bike`, where that function is a different sort of model with a different set of feedback mechanisms than those available to the LLM, is NOT the same thing at all. The LLM hasn't "learned" to ride the bike. The best you can say is that the LLM has learned that the bike can be ridden, and that it has a way of asking some other entity to ride on its behalf.
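
A minimal sketch of that distinction, under stated assumptions: `ToolCall`, `balance_controller`, `TOOLS`, and `dispatch` are hypothetical names invented for illustration, and the controller is a trivial stand-in for a separately trained model. The point is only that the LLM's output is a request, while the riding happens elsewhere.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class ToolCall:
    """What the LLM actually produces: a tool name plus arguments."""
    name: str
    arguments: dict


def balance_controller(distance_m: float) -> str:
    # Hypothetical stand-in for a different kind of model (e.g. RL) with its
    # own sensors and feedback loop. None of that lives inside the LLM.
    return f"rode {distance_m:.0f} m"


# Registry mapping tool names to the systems that really do the work.
TOOLS: Dict[str, Callable[..., str]] = {"ride_bike": balance_controller}


def dispatch(call: ToolCall) -> str:
    """Route the LLM's request to whatever implements the named tool."""
    return TOOLS[call.name](**call.arguments)


# The LLM's entire contribution is this request; the skill is in the registry.
request = ToolCall(name="ride_bike", arguments={"distance_m": 100.0})
result = dispatch(request)
print(result)
```

Swapping `balance_controller` for a better controller changes how well the bike is ridden without changing anything about the LLM.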

Now, could you develop such a model and make it available to an LLM? Sure, probably. But that's not an LLM. Moreover, it involves you, a human, making novel inroads on a different sort of AI/robotics problem. It simply is not possible to accomplish with an LLM.

anthuswilliams commented on Eight more months of agents crawshaw.io/blog/eight-mo... · Posted by u/arrowsmith

pjc50 · a day ago

> Right now every app feels like a walled garden, with broken UX, constant redesigns, enormous amounts of telemetry and user manipulation

OK, but: that's an economic situation.

> so much less scope for engagement-hacking, dark patterns, useless upselling, and so on.

Right, so there's less profit in it.

To me it seems this will make the market more adversarial, not less. Increasing amounts of effort will be expended to prevent LLMs interacting with your software or web pages. Or in some cases exploit the user's agentic LLM to make a bad decision on their behalf.

anthuswilliams · a day ago

Maybe. Or maybe services will switch to charging per API call or whatever instead of monthly or per-seat. Who can predict the future?

I mean, services _could_ make it harder to use LLMs to interact with them, but if agents are popular enough they might see customers start to revolt over it.

anthuswilliams commented on Eight more months of agents crawshaw.io/blog/eight-mo... · Posted by u/arrowsmith

dmk · 2 days ago

The real insight buried in here is "build what programmers love and everyone will follow." If every user has an agent that can write code against your product, your API docs become your actual product. That's a massive shift.

anthuswilliams · 2 days ago

I'm very much looking forward to this shift. It is SO MUCH more pro-consumer than the existing SaaS model. Right now every app feels like a walled garden, with broken UX, constant redesigns, enormous amounts of telemetry and user manipulation. It feels like every time I ask for programmatic access to SaaS tools in order to simplify a workflow, I get stuck in endless meetings with product managers trying to "understand my use case", even for products explicitly marketed to programmers.

Using agents that interact with APIs represents people being able to own their user experience more. Why not craft a frontend that behaves exactly the way YOU want it to, tailor-made for YOUR work, abstracting the set of products you are using and focusing only on the actual relevant bits of the work you are doing? Maybe a downside might be that there is more explicit metering of use in these products instead of the per-user licensing that is common today. But the upside is there is so much less scope for engagement-hacking, dark patterns, useless upselling, and so on.

anthuswilliams commented on Experts Have World Models. LLMs Have Word Models latent.space/p/adversaria... · Posted by u/aaronng91

CamperBob2 · 3 days ago

What is in the nature of bike-riding that cannot be reduced to text?

You know transformers can do math, right?

anthuswilliams · 3 days ago

> What is in the nature of bike-riding that cannot be reduced to text?

You're asking someone to answer this question in a text forum. This is not quite the gotcha you think it is.

The distinction between "knowing" and "putting into language" is a rich source of epistemological debate going back to Plato and is still widely regarded to represent a particularly difficult philosophical conundrum. I don't see how you can make this claim with so much certainty.

anthuswilliams commented on 2025 Letter danwang.co/2025-letter/... · Posted by u/Amorymeltzer

kalkin · a month ago

I have nonspecific positive associations with Dan Wang's name, so I rolled my eyes a bit but kept going when "If the Bay Area once had an impish side, it has gone the way of most hardware tinkerers and hippie communes" was followed up by "People aren’t reminiscing over some lost golden age..."

But I stopped at this:

> “AI will be either the best or the worst thing ever.” It’s a Pascal’s Wager

That's not what Pascal's wager is! Apocalyptic religion dates back more than two thousand years and Blaise Pascal lived in the 17th century! When Rosa Luxemburg said to expect "socialism or barbarism", she was not doing a Pascal's Wager! Pascal's Wager doesn't just involve infinite stakes, but also infinitesimal probabilities!

The phrase has become a thought-terminating cliche for the sort of person who wants to dismiss any claim that stakes around AI are very high, but has too many intellectual aspirations to just stop with "nothing ever happens." It's no wonder that the author finds it "hard to know what to make of" AI 2027 and says that "why they put that year in their title remains beyond me."

It's one thing to notice the commonalities between some AI doom discourse and apocalyptic religion. It's another to make this into such a thoughtless reflex that you also completely muddle your understanding of the Christian apologetics you're referencing. There's a sort of determined refusal to even grasp the arguments that an AI doomer might make, even while writing an extended meditation on AI, for which I've grown increasingly intolerant. It's 2026. Let's advance the discourse.

anthuswilliams · a month ago

I'm not sure I understand your complaint. Is it that he misuses the term Pascal's Wager? Or more generally that he doesn't extend enough credibility to the ideas in AI 2027?

anthuswilliams commented on Drugmakers raise US prices on 350 medicines despite pressure reuters.com/business/heal... · Posted by u/JumpCrisscross

lotsofpulp · a month ago

> but the central point is that people who have to USE their insurance (i.e. sick people) subsidize the premiums of people who don't (healthy people), and this critique applies regardless of age.

You’re losing me here. This claim is categorically false. You cannot consider only the deductible when calculating who subsidizes who.

The only way to calculate it is premiums + deductible + out of pocket maximum = total healthcare costs. And the subsidy via premium is so large that it negates effects of a deductible and out of pocket maximum.

Note that all plans have to be actuarially equivalent, regardless of what deductible you choose. The actuaries have to account for rebates and other pricing strategies when ensuring actuarial equivalence, so that the ratio of what the plan pays versus what you pay meets the required ratio for that metal level.

https://www.healthcare.gov/choose-a-plan/plans-categories/

Since your health is not a factor in pricing your insurance, it has to be that people less likely to need healthcare pay for the people likely to need healthcare.

It is the same as if the government forbade auto insurers from using moving violations history, or life insurers from using health measures, or home insurers from using flood maps.

anthuswilliams · a month ago

The claim about who subsidizes who was always hyperbole, I'll grant you that. I included the statement to make the point that this is the phenomenon people are referring to when they make that statement.

I happen to think there is validity to the statement if you control for other actuarial factors. But if you don't think that makes sense as a lens through which to look at the problem, I won't quibble, even though I disagree. We're also only talking about drug prices here, which is a small portion of overall healthcare spending.

In any case, the central point, that insurers benefit from higher prices, still stands.

anthuswilliams commented on Drugmakers raise US prices on 350 medicines despite pressure reuters.com/business/heal... · Posted by u/JumpCrisscross

lotsofpulp · a month ago

> Contrast this with a no-rebate world with cheaper/more transparent pricing. Fewer patients would hit their out of pocket maximum.

And premiums would go up. Every insurer has to get their premium approved by every state’s insurance regulator, and every state’s insurance regulator is not going to allow them to have more than a few percent of profit.

> They can use the rebates they get from the providers to subsidize the insured, allowing them to offer lower premiums and gain market share. This is what people mean when they say "In America, the sick people pay to subsidize the health care of the healthy people".

I’ve never heard of this, and it’s legally not allowed. The ACA mandates insurers price plans so that old people only pay at most 3x what young people pay. And the ACA does not allow insurers to charge more to people likelier to need healthcare. Mathematically, that means younger and healthier people pay higher premiums so that older and sicker people can have lower premiums.

NY state goes even further and says all ages pay the same premium, so young subsidizes old even more. MA has a 2x cap, I believe. And then of course, FICA taxes mean the young and working are paying for the healthcare for the old and non working, the vast majority of all healthcare spend in the US (Medicare).

anthuswilliams · a month ago

> And premiums would go up.

Yes. As I wrote above, insurers compete on premiums, and they do so by using rebates to subsidize those premiums by spreading patients' deductibles across the insured population. As far as profits go, I can't speak to regulatory issues since they will vary by state, but in any case the same critique would apply if insurers are pocketing a fixed percentage of a larger amount.

Re your second point, it completely twists my point and is largely irrelevant. Yes, older people paying the same premiums as younger people is a counter-argument in that older people are more likely to need healthcare, but the central point is that people who have to USE their insurance (i.e. sick people) subsidize the premiums of people who don't (healthy people), and this critique applies regardless of age. Now, one could argue that the structural factors that control costs across age cohorts counterbalance this phenomenon. And I'd agree with you! But that doesn't negate the original point that insurance companies benefit from, and advocate for, high sticker prices.

anthuswilliams commented on Drugmakers raise US prices on 350 medicines despite pressure reuters.com/business/heal... · Posted by u/JumpCrisscross

lotsofpulp · a month ago

Pharmaceutical companies, hospitals, and doctors are free to charge by the medicine, by the night, and by the minute.

For example, this place does it:

https://surgerycenterok.com/surgery-prices/

Insurance companies do not force the sellers to use complex billing practices, they would benefit from more transparent pricing (since they are seeking to pay less).

The root cause is healthcare is inherently complicated and complex, it has a problem of supply being nowhere near demand, and since prices for things are so high (including liability), there is a lot of cover your ass and fraud prevention going on.

anthuswilliams · a month ago

Insurance companies absolutely benefit from the higher and opaque prices, because they negotiate rebates with providers. This allows them to maximize patient copays and ensures patients hit their deductibles, i.e. pay as much as possible under their respective insurance plans. Contrast this with a no-rebate world with cheaper/more transparent pricing. Fewer patients would hit their out of pocket maximum.

They can use the rebates they get from the providers to subsidize the insured, allowing them to offer lower premiums and gain market share. This is what people mean when they say "In America, the sick people pay to subsidize the health care of the healthy people".

Of course, the above only applies if there is competitive pressure. If there is no competitive pressure (e.g. in states with only one or two insurers), they can keep premiums high and book as profit the difference between what the patient paid out and what the patient would have paid out in a lower-cost no-rebate world.

anthuswilliams commented on Skills for organizations, partners, the ecosystem claude.com/blog/organizat... · Posted by u/adocomplete

deaux · 2 months ago

I'll give you a short reply, as another person who finds MCP very useful. I think a big gap is that MCPs are often marketed as "taking actions" for you, because that's flashy and looks cool in the eyes of laymen, while most of their actual value is the opposite: using them to gather information to take better non-MCP actions. Connecting them to logs, read-only to (e.g. mock) databases, knowledge bases, and so on. All for querying, not for create/update/delete.

anthuswilliams · 2 months ago

Agree with this framing. They are like RAG setups that you can compose together without needing to build a dedicated app to do it.

anthuswilliams commented on Skills for organizations, partners, the ecosystem claude.com/blog/organizat... · Posted by u/adocomplete

verelo · 2 months ago

This is fascinating. I really appreciate the lengthy reply.

How do you handle versioning/updates when datasets change? Do the MCPs break or do you have some abstraction layer?

What's your hit rate on researchers actually converting LLM explorations into permanent artifacts vs just using it as a one-off?

Makes sense for research workflows. Do you think this pattern (LLM exploration > traditional tools) generalizes outside domains with high uncertainty? Or is it specifically valuable where 'deciding what to do' is the hard part?

Someone else mentioned using Chrome dev tools + Cursor, I'm going to try that one out as a way to convince myself here. I want to make this work but I just feel like I'm missing something. The problem is clearly me, so I guess I need to put in some time here.

anthuswilliams · 2 months ago

> How do you handle versioning/updates when datasets change?

For data MCPs, we use remote MCPs that are served over a stdio bridge. So our configuration is just mcp-proxy[0] pointed at a fixed URL we control. The server has an /mcp endpoint that provides tools, and that endpoint is hit whenever the desktop LLM starts up. So adding/removing/altering tools is simply a matter of changing that service and redeploying that API. (Note: There are sometimes complications. For example, if I change an endpoint that used to return data directly so that it now writes a file to cloud storage and returns a URL (because the result is too large, i.e. to work around the aforementioned broken factor of MCP), we have to sync with our IT team to deploy a configuration change to everyone's machine.)
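
A rough sketch of the large-result workaround described in that note, not the actual service. Everything here is an assumption made for illustration: the `tool_result` helper, the `MAX_INLINE_BYTES` threshold, the response shape, and `fake_upload`, which stands in for a real cloud-storage write.

```python
import json
from typing import Callable

MAX_INLINE_BYTES = 1_000  # assumed limit; a real value depends on the client


def tool_result(rows: list, upload: Callable[[bytes], str],
                max_inline_bytes: int = MAX_INLINE_BYTES) -> dict:
    """Return rows inline when small, otherwise upload them and return a URL."""
    payload = json.dumps(rows).encode("utf-8")
    if len(payload) <= max_inline_bytes:
        return {"kind": "inline", "data": rows}
    # Too large to hand back directly: store it and return a pointer instead.
    return {"kind": "url", "url": upload(payload), "bytes": len(payload)}


def fake_upload(payload: bytes) -> str:
    # Hypothetical stand-in for writing the payload to cloud storage.
    return f"https://storage.example.invalid/results/{len(payload)}.json"


small = tool_result([{"id": 1}], fake_upload)
large = tool_result([{"id": i} for i in range(500)], fake_upload)
print(small["kind"], large["kind"])
```

Passing `upload` in as a parameter keeps the size check testable without any storage credentials.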

I have seen nicer implementations that use a full MCP gateway that does another proxy step to the upstream MCP servers, which I haven't used myself (though I want to). The added benefit is that you can log/track which MCPs your users are using most often and how they are doing, and you can abstract away a lot of the details of auth, monitor for security issues, etc. One of the projects I've looked at in that space is Mint MCP, but I haven't used it myself.
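
To make the logging/tracking benefit concrete, here is a toy sketch of the gateway idea. It is not Mint MCP's API or any real gateway's; the `Gateway` class, its counters, and the `query_logs` tool are hypothetical, and the upstream servers are reduced to plain Python callables.

```python
from collections import Counter
from typing import Callable, Dict


class Gateway:
    """Toy gateway: forwards tool calls upstream and records usage."""

    def __init__(self, upstream: Dict[str, Callable[..., object]]):
        self._upstream = upstream
        self.calls = Counter()   # tool name -> number of calls
        self.users = Counter()   # user name -> number of calls
        self.errors = Counter()  # tool name -> number of failed calls

    def call(self, user: str, tool: str, **kwargs) -> object:
        self.calls[tool] += 1
        self.users[user] += 1
        try:
            return self._upstream[tool](**kwargs)
        except Exception:
            # Record the failure, then let the caller see the original error.
            self.errors[tool] += 1
            raise


gateway = Gateway({"query_logs": lambda text: [f"match: {text}"]})
gateway.call("alice", "query_logs", text="timeout")
gateway.call("bob", "query_logs", text="oom")
print(gateway.calls.most_common(1))
```

A real gateway would also handle auth and transport; the sketch only shows where usage and failure counts could be collected.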

> What's your hit rate on researchers actually converting LLM explorations into permanent artifacts vs just using it as a one-off?

Low. Which in our case is ideal, since most research ideas can be quickly discarded and save us a ton of time and money that would otherwise be spent running doomed lab experiments, etc. As you get later in the drug discovery pipeline you have a larger team built around the program, and then the artifacts are more helpful. There still isn't much of a norm in the biotech industry of having an engineering team support an advanced drug program (a mistake, IMO) so these artifacts go a long way given these teams don't have dedicated resources.

> Do you think this pattern (LLM exploration > traditional tools) generalizes outside domains with high uncertainty?

I don't know for sure, as I don't live in that world. My instinct is: I wouldn't necessarily roll something like this out to external customers if you have a well-defined product. (IMO there just isn't that much of a market for uncertain outputs of such products, which is why all of the SaaS companies that have launched their integrated AI tools haven't seen much success with them.) But even within a domain like that, it can be useful to e.g. your customer support team, your engineers, etc. For example, one of the ideas on my "cool projects" list is an SRE toolkit that can query across K8s, Loki/Prometheus, your cloud provider, your git provider and help quickly diagnose production issues. I imagine the result of such an exploration would almost always be a new dashboard/alert/etc.

[0] https://github.com/sparfenyuk/mcp-proxy - don't know much about this repo, but it was our starting point