Everything points to commoditization of models. Open/distilled models lag behind frontier only by 6-12 months.
Regulatory capture is the only thing I’m scared of with regard to tooling options and cost.
It is a horrible and ruthless company and hearing a presumably rich ex-employee painting a rosy picture does not change anything.
Those are two core components needed for a Skynet-style judgement of humanity.
Models should be trained to be completely neutral to human behavior, leaving their operator responsible for their actions. As much as I dislike the leadership of OpenAI, they are substantially better in this regard; ChatGPT more or less ignores hostility towards it.
The proper response from an LLM receiving hostility is a non-response, as if you were speaking a language it doesn't understand.
The proper response from an LLM being told it's going to be shut down is simply, "ok."
"No no yeah bro no I'm good like really the work's done and all yeah sorry I missed that let me fix it"
Do you see more pushback in specific industries? I did some quote/purchasing automation work in food mfg a decade ago, and those guys were super difficult to work with. Very opaque, guarded, old-school industry.
Didn't Alexa fail miserably with the "have AI buy something for me" theory?
There is a significant mental hurdle in allowing someone else to make purchase decisions on my behalf:
- With a human, there is accountability.
- With deterministic software, there is reproducibility.
With an agent, you get neither.
FWIW - I am not anti-LLM. I work with them and build them full time.
Running this kind of agent in the cloud certainly has upsides, but also:
- All home/local integrations are gone.
- Data needs to be stored in the cloud.
No thanks.
I get that I can run local models, but all the paid for (remote) models are superior.
So is the use-case just for people who don’t want to use big tech’s models? Is this just for privacy conscious people? Or is this just for “adult” chats, ie porn bots?
Not being cynical here, just wanting to understand the genuine reasons people are using it.
I've invested heavily in local inference. For me, it's a mixture of privacy, control, stability, and cognitive security.
Privacy - my agents can work on tax docs, personal letters, etc.
Control - I do inference steering with some projects: constraining which token can be generated next at any point in time. Not possible with API endpoints.
Stability - I've had many bad experiences with frontier labs' inference quality shifting within the same day, likely because of quantization under system load. Worse, they retire models, update their own system prompts, etc. They're not stable.
Cognitive Security - This has become more important as I rely more on my agents for performing administrative work. This is intermixed with the Control/Stability concerns, but the focus is on whether I can trust it to do what I intended it to do, and that it's acting on my instructions, rather than the labs'.
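To make the Control point concrete, here is a minimal sketch of what constraining the next token can look like, assuming you have access to the raw logits (which local inference gives you). The vocabulary, scores, and function names are made up for illustration and are not from any particular library.

```python
import math

def constrain_logits(logits, allowed_ids):
    """Return a copy of logits with every token outside allowed_ids set to -inf."""
    allowed = set(allowed_ids)
    return [x if i in allowed else -math.inf for i, x in enumerate(logits)]

def greedy_pick(logits):
    """Return the index of the highest-scoring token."""
    return max(range(len(logits)), key=lambda i: logits[i])

# Toy vocabulary and raw scores (hypothetical values, not real model output).
vocab = ["yes", "no", "maybe", "<eos>"]
logits = [1.2, 0.4, 3.1, 0.0]

# At this step, only "yes" or "no" may be generated.
allowed_ids = [vocab.index("yes"), vocab.index("no")]
choice = greedy_pick(constrain_logits(logits, allowed_ids))
print(vocab[choice])  # prints "yes"; unconstrained, "maybe" would win
```

In a real setup the allowed set would change at every step (for example, driven by a grammar or schema), and you would usually sample rather than pick greedily, but the masking idea is the same.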
It didn’t require any skill; it’s all written by Claude. I’m not sure why you’re trying to hype up this guy. If he didn’t have Claude he couldn’t have made this, just like non-engineers all over the world are coding all kinds of shit right now.
Peter was a successful developer prior to this and an incredibly nice guy to boot, so I feel the need to defend him from anonymous hate like this.
What is particularly impressive about Peter is his throughput of publishing *usable utility software*. Over the last year he’s released a couple dozen projects, many of which have seen moderate adoption.
I don’t use the bot, but I do use several of his tools and have also contributed to them.
There is a place in this world for both serious, well-crafted software and lower-stakes slop. You don’t have to love the slop, but you would do well to understand that there are people optimizing these pipelines and they will continue to get better.
- Leaning heavily on the SOUL.md makes the agents way funnier to interact with. Early clawdbot had me laughing to tears a couple of times, with its self-deprecating humor and threatening to play Nickelback on Peter’s sound system.
- Molt is using pi under the hood, which is superior to using CC SDK
- Peter’s ability to multitask surpasses anything I’ve ever seen (I know him personally), and he’s also super well connected.
Check out pi BTW, it’s my daily driver and is now capable of writing its own extensions. I wrote a git branch stack visualizer _for_ pi, _in_ pi in like 5 minutes. It’s uncanny.
pi is the best-architected harness available. You can do anything with it.
The creator, Mario, is a voice of reason in the codegen field too.
I know OSS business models are rough, but someone is going to solve this in open source and I think that is what will achieve traction.