I get so confused on this. I play around, test, and mess with LLMs all the time and they are miraculous. Just amazing, doing things we dreamed about for decades. I mean, I can ask for obscure things with subtle nuance where I misspell words and mess up my question and it figures it out. It talks to me like a person. It generates really cool images. It helps me write code. And just tons of other stuff that astounds me.
And people just sit around, unimpressed, and complain that ... what ... it isn't a perfect superintelligence that understands everything perfectly? This is the most amazing technology I've experienced as a 50+ year old nerd who has been sitting deep in tech for basically my whole life. This is the stuff of science fiction, and while there totally are limitations, the speed at which it is progressing is insane. And people are like, "Wah, it can't write code like a senior engineer with 20 years of experience!"
The technology is not just less than superintelligence, for many applications it is less than prior forms of intelligence like traditional search and Stack Exchange, which were easily accessible 3 years ago and are in the process of being displaced by LLMs. I find that outcome unimpressive.
And this Tweeter's complaints do not sound like a demand for superintelligence. They sound like a demand for something far more basic than the hype has been promising for years now.
- "They continue to fabricate links, references, and quotes, like they did from day one."
- "I ask them to give me a source for an alleged quote, I click on the link, it returns a 404 error." (Why have these companies not manually engineered out a problem like this by now? Just do a check to make sure links are real. That's pretty unimpressive to me.)
- "They reference a scientific publication, I look it up, it doesn't exist."
- "I have tried Gemini, and actually it was even worse in that it frequently refuses to even search for a source and instead gives me instructions for how to do it myself."
- "I also use them for quick estimates for orders of magnitude and they get them wrong all the time. "
- "Yesterday I uploaded a paper to GPT to ask it to write a summary and it told me the paper is from 2023, when the header of the PDF clearly says it's from 2025. "
A municipality in Norway used an LLM to create a report about the school structure in the municipality (how many schools there are, how many there should be, where they should be, how big they should be, pros and cons of different school and class sizes, etc.). Turns out the LLM invented scientific papers to use as references, and the whole report is complete and utter garbage based on hallucinations.
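A minimal link-verification pass is indeed easy to sketch. This is a hypothetical post-processing step, not anything these products actually do; the regex, function names, and defaults are mine, and it uses only the Python standard library:

```python
# Hedged sketch: verify that URLs an LLM emitted actually resolve,
# so dead links can be flagged or the model re-prompted.
import re
import urllib.request
from urllib.error import HTTPError, URLError

URL_RE = re.compile(r"https?://[^\s)\"']+")

def extract_urls(text: str) -> list[str]:
    """Pull candidate URLs out of model output."""
    return URL_RE.findall(text)

def link_is_live(url: str, timeout: float = 5.0) -> bool:
    """Return True if the URL answers with a non-error HTTP status."""
    req = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status < 400
    except (HTTPError, URLError, TimeoutError):
        return False

def filter_dead_links(text: str) -> list[str]:
    """Return the URLs in `text` that do not resolve."""
    return [u for u in extract_urls(text) if not link_is_live(u)]
```

A provider could loop this over the output and re-prompt until `filter_dead_links` comes back empty, though HEAD requests aren't free at chat scale and some sites answer 403/404 to bots, so it's cruder than it looks.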
There are no fabricated links, references, or quotes in OpenAI's GPT 4.5 + Deep Research.
It's unfortunate the cost of a Deep Research bespoke white paper is so high. That mode is phenomenal for pre-work domain research. You get an analyst's two-week writeup in under 20 minutes, for the low cost of $200/month (though I've seen estimates that a white paper costs OpenAI over USD 3,000 to produce for you, which explains the monthly limits).
You still need to be a domain expert to make use of this, just as you need to be to make use of an analyst. Both the analyst and Deep Research can generate flawed writeups with similar failure modes: mis-synthesis, misapplication, or omission of something essential.
Neither analyst nor LLM is a substitute for mastery.
This assessment is incomplete. Large language models are both less and more than these traditional tools. They have not subsumed them; all can sit together in separate tabs of a single browser window. They are another resource, and when the conditions are right, which is often the case in my experience, they are a startlingly effective tool for navigating the information landscape. The criticism of Gemini is a fair one, and I encountered it yesterday, though perhaps with 50% less entitlement. But Gemini also helped me translate obscure termios APIs to Python from C source code I provided. The equivalent using search and/or Stack Overflow would have required multiple piecemeal searches without guarantees, and would definitely have taken much more time.
The 404 links are hilarious, like you can't even parse the output and retry until it returns a link that doesn't 404? Even ignoring the billions in valuation, this is so bad for a $20 sub.
The tweeter's complaints sound like a user problem. LLMs are tools. How you use them, when you use them, and what you expect out of them should be based on the fact that they are tools.
I’m sorry but the experience of coding with an LLM is about ten billion times better than googling and stack overflowing every single problem I come across. I’ve stack overflowed maybe like two things in the past half year and I’m so glad to not have to routinely use what is now a very broken search engine and web ecosystem.
The whole point is that an LLM is not a search engine and obviously anyone who treats it as one is going to be unsatisfied. It's just not a sensible comparison. You should compare working with an LLM to working with an old "state of the art" language tool like Python NLTK -- or, indeed, specifying a problem in Python versus specifying it in the form of a prompt -- to understand the unbridgeable gap between what we have today and what seemed to be the best even a few years ago. I understand when a popular science author or my relatives haven't understood this several years after mass access to LLMs, but I admit to being surprised when software developers have not.
Hosted, free or subscription-based Deep Research-like tools that integrate LLMs with search functionality (the whole domain of "RAG", or Retrieval-Augmented Generation) will be elementary for a long time yet, simply because the cost of the average query starts to go up exponentially and there isn't that much money in it yet. Many people have built, and will continue to build, their own research tools where they can decide how much compute time and API-access cost they're willing to spend on a given query. OCR remains a hard problem, let alone appropriately chunking potentially hundreds of long documents into the context length and synthesizing potentially thousands of LLM outputs into a single response.
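The chunking step described above can be sketched in a few lines. This is a deliberately crude, word-based stand-in (a real pipeline would count tokens with the model's tokenizer and split on document structure); the parameter values are illustrative:

```python
# Split a long document into overlapping word windows that fit a context
# budget. Overlap keeps sentences that straddle a boundary visible in two
# chunks. "Tokens" here are just whitespace-separated words.

def chunk_document(text: str, max_tokens: int = 512, overlap: int = 64) -> list[str]:
    """Return overlapping chunks of roughly max_tokens words each."""
    words = text.split()
    if not words:
        return []
    chunks = []
    step = max_tokens - overlap          # how far each window advances
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_tokens]))
        if start + max_tokens >= len(words):
            break                        # last window reached the end
    return chunks
```

Each chunk would then be embedded or summarized separately and the per-chunk outputs synthesized in a final pass; that synthesis step is where most of the cost and quality problems live.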
It's mostly because of how they were initially marketed. In an effort to drive hype 'we' were promised the world. Remember the "leaks" from Google about an engineer trying to get the word out that they had created a sentient intelligence? In reality Bard, let alone whatever early version he was using, is about as sentient as my left asscheek.
OpenAI did similar things by focusing to the point of absurdity on 'safety' for what was basically a natural language search engine with a habit of inventing nonsensical stuff. But on that same note (and as you alluded to), I do agree that LLMs have a lot of use as natural language search engines in spite of their proclivity to hallucinate. Being able to describe e.g. a function call (or some esoteric piece of history) and then often get the precise term/event that I'm looking for is just incredibly useful.
But LLMs obviously are not sentient, are not setting us on the path to AGI, or any other such nonsense. They're arguably what search engines should have been 10 or 15 years ago, but anti-competitive monopolization of the industry meant that search engine technology progress basically stalled out, if not regressed for the sake of ads (and individual 'entrepreneurs' becoming better at SEO), about the time Google fully established itself.
> Remember the "leaks" from Google about an engineer trying to get the word out that they had created a sentient intelligence?
I presume you are referring to this Google engineer, who was sacked for making the claim. Hardly an example of AI companies overhyping the tech; precisely the opposite, in fact.
https://www.bbc.co.uk/news/technology-62275326
It seems to be a common human hallucination to imagine that large organisations are conspiring against us.
> Remember the "leaks" from Google about an engineer trying to get the word out that they had created a sentient intelligence?
That's not what happened. Google stomped hard on Lemoine, saying clearly that he was wrong about LaMDA being sentient ... and then they fired him for leaking the transcripts.
Your whole argument here is based on false information and faulty logic.
> OpenAI did similar things by focusing to the point of absurdity on 'safety' for what was basically a natural language search engine that has a habit of inventing nonsensical stuff.
The focus on safety, and the concept of "AI", preexisted the product. An LLM was just the thing they eventually made; it wasn't the thing they were hoping to make. They applied their existing beliefs to it anyway.
I am worried about them as a substitute for search engines.
My reasoning is that classic Google web-scraping and SEO, as shitty as it may be, is 'open-source' (or at least 'open-citation') in nature - you can 'inspect the sh*t it's built from'.
Whereas LLMs, to me, seem like a Chinese - or Western - totalitarian political system's wet dream: 'we can set up an inscrutable source of "truth" for the people to use, with the _truths_ we intend them to receive'.
We already saw how weird and unsane this was when they were configured to be woke under the previous regime. Imagining it configured for 'the other post-truth' is a nightmare.
> Remember the "leaks" from Google about an engineer trying to get the word out that they had created a sentient intelligence?
No, first time I hear about it. I guess the secret to happiness is not following leaks. I had very low expectations before trying LLMs and I’m extremely impressed now.
They have their value in analyzing huge amounts of data, for example scientific papers or raw observations, but the popular public ones are mostly trained on stolen/pirated texts off the internet and from the social media clouds the companies control. So this means: bullshit in -> bullshit out. I don't need machines for that; the regular human bullshitters do this job just fine.
Nobody promised the world. The marketing underpromised and LLMs overdelivered. Safety worries didn't come from marketing; they came from people who were studying this as a mostly theoretical worry for the next 50+ years, only to see major milestones crossed a decade or more before they expected.
Did many people overhype LLMs? Yes, like with everything else (transhumanist ideas, quantum physics). It helps being more picky who one listens to, and whether they're just painting pretty pictures with words, or actually have something resembling a rational argument in there.
Folks really over-index when an LLM is very good for their use case. And most of the folks here are coders, at which they're already good and getting better.
For some tasks they're still next to useless, and people who do those tasks understandably don't get the hype.
Tell a lab biologist or chemist to use an LLM to help them with their work and they'll get very little useful out of it.
Ask an attorney to use it and it's going to miss things that are blindingly obvious to the attorney.
Ask a professional researcher to use it and it won't come up with good sources.
For me, I've had a lot of those really frustrating experiences where I'm having difficulty on a topic and it gives me utter incorrect junk because there just isn't a lot already published about that data.
I've fed it tricky programming tasks and gotten back code that doesn't work, and that I can't debug because I have no idea what it's trying to do, or I'm not familiar with the libraries it used.
It sounds like you're trying to use these LLMs as oracles, which is going to cause you a lot of frustration. I've found almost all of them now excel at imitating a junior dev or a drunk PhD student. For example, the other day I was looking at acoustic sensor data and ran it down the trail of "what are some ways to look for repeating patterns like xyz"; 10 minutes later I had a mostly working proof of concept for a 2nd-order spectrogram that reasonably dealt with spectral leakage, plus a half-working mel-spectrum fingerprint idea. Those are all things I was thinking about myself, so I was able to guide it to a mostly working prototype in very little time. But doing it myself from zero would've taken at least a couple of hours.
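For the curious, the "2nd-order spectrogram" idea can be sketched with plain NumPy: take a spectrogram, then FFT each frequency bin along the time axis, so a pattern repeating at, say, 5 Hz shows up as a peak at 5 Hz in the second transform. This is my reconstruction of the general technique (sometimes called a modulation spectrum), not the commenter's actual code; frame sizes and names are illustrative:

```python
import numpy as np

def spectrogram(x: np.ndarray, n_fft: int = 256, hop: int = 128) -> np.ndarray:
    """Magnitude spectrogram via framed, windowed FFTs (freq x time)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop : i * hop + n_fft] * window
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1)).T  # shape (n_fft//2 + 1, n_frames)

def modulation_spectrum(x: np.ndarray, n_fft: int = 256, hop: int = 128) -> np.ndarray:
    """FFT each spectrogram row over time: a 'spectrogram of the spectrogram'
    in which periodic repetitions appear as peaks at their repetition rate."""
    S = spectrogram(x, n_fft, hop)
    # Subtract each row's mean so the DC term doesn't swamp the repetition peaks.
    return np.abs(np.fft.rfft(S - S.mean(axis=1, keepdims=True), axis=1))
```

The windowing and mean subtraction are what keep spectral leakage and the DC component from burying the structure you're looking for.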
But truthfully, 90% of work-related programming is not problem solving; it's implementing business logic and dealing with poor, ever-changing customer specs. Which an LLM will not help with.
Strongly in agreement. I've tried them and mostly come away unimpressed. If you work in a field where you have to get things right, and it's more work to double check and then fix everything done by the LLM, they're worse than useless. Sure, I've seen a few cases where they have value, but they're not much of my job. Cool is not the same as valuable.
If you think "it can't quite do what I need, I'll wait a little longer until it can" you may still be waiting 50 years from now.
My view is that it will be some time before they can as well, precisely because of the success in the software domain - not because LLMs aren't capable as a tech, but because data owners and practitioners in other domains will resist the change. From the SWE experience, news reports, financial magazines, etc., many are preparing accordingly, even if it is a subconscious thing. People don't like change and don't want to be threatened when it is them at risk - no one wants what happened to artists, and now SWEs, to happen to their profession. They are happy for other professions to "democratize/commoditize" as long as it isn't them - after all, this increases their purchasing power. Don't open source knowledge/products, don't let AI near your vertical domain, continue to command a premium for as long as you can - I've heard variations of this in many AI conversations. It's much easier in oligopoly- and monopoly-like domains, and/or domains where knowledge was known to be a moat even when mixed with software, as you have more trust that competitors won't do the same.
For many industries/people, work is a means to earn, not something to be passionate about for its own sake. It's a means to provide for the other things in life you are actually passionate about (e.g. family, lifestyle, etc.). In the end AI may get your job eventually, but if it gets yours much later than other industries/domains, you win from a capital perspective, as other goods get cheaper while you still command your pre-AI scarcity premium. This makes it easier for them to acquire more assets from the early-disrupted industries and shields them from AI eventually taking over.
I'm seeing this directly in software: fewer new frameworks/libraries/etc. outside the AI domain being published, IMO, and more apprehension from companies about open sourcing their work and/or exposing what they do. Attracting talent is also no longer as strong a reason to showcase what you do to prospective employees - economic conditions and/or AI make that less necessary as well.
Honestly it's worse than this. A good lab biologist/chemist will try to use it, understand that it's useless, and stop using it. A bad lab biologist/chemist will try to use it, think that it's useful, and then it will make them useless by giving them wrong information. So it's not just that people over-index when it is useful, they also over-index when it's actively harmful but they think it's useful.
The problem Sabine tries to communicate is that reality is different from what the cash-heads behind main commercial models are trying to portray. They push the narrative that they’ve created something akin to human cognition, when in reality, they’ve just optimised prediction algorithms on an unprecedented scale. They are trying to say that they created Intelligence, which is the ability to acquire and apply knowledge and skills, but we all know the only real Intelligence they are creating is the collection of information of military or political value.
The technology is indeed amazing and very amusing, but like all the good things in the hands of corporate overlords, it will be slowly turning into profit-milking abomination.
> They push the narrative that they’ve created something akin to human cognition
This is your interpretation of what these companies are saying. I'd love to see whether any company has specifically said anything like that.
Out of the last 100 years, how many inventions have been made that could make any human feel the awe LLMs do right now? How many things from today, when brought back to 2010, would make the person using them feel like they're being tricked or pranked? We already take them for granted even though they've been around for less than half a decade.
LLMs aren't a catch-all solution to the world's problems, or something that is going to help us in every facet of our lives, or an accelerator for every industry out there. But at no point in history could you talk to your phone about general topics, get information, practice language skills, build an assistant that teaches your kid the basics of science, use something to accelerate your work in many different ways, etc...
Looking at LLMs shouldn't be boolean; it shouldn't be a choice between "best thing ever invented" and "useless". But it seems like everyone presents the issue in this manner, and Sabine is part of that problem.
Much as I agree with the point about overhyping from companies, I'd be more sympathetic to this point of view if she acknowledged the merits of the technology.
Yes, it hallucinates and if you replace your brain with one of these things, you won't last too long. However, it can do things which, in the hands of someone experienced, are very empowering. And it doesn't take an expert to see the potential.
As it stands, it sounds like a case of "it's great in practice but the important question is how good it is in theory."
I hate to bring an ad hominem into this, but Sabine is a YouTube influencer now. That's her current career. So I'd assume this Tweet storm is also pushing a narrative on its own, because that's part of doing the work she chose to do to earn a living.
LLMs seem akin to parts of human cognition, maybe the initial fast-thinking bit when ideas pop up in a second or two. But any human writing a review with links to sources would look them up and check that they are the right ones, matching the initial idea. Current LLMs don't seem to do that, at least the ones Sabine complains about.
Akin to human cognition but still a few bricks short of a load, as it were.
You lay the rhetoric on so thick (“cash heads”, “pushing the narrative”, “corporate overlords”, “profit-making abomination”) that it’s hard to understand your actual claim.
Are you trying to say that LLMs are useful now but you think that will stop being the case at some point in the future?
Look, man (and I'm saying this not to you but to everyone who is in this boat): you've got to understand that after a while, the novelty wears off. We get it. It's miraculous that some gigabytes of matrices can possibly interpret and generate text, images, and sound. It's fascinating, it really is. Sometimes, it's borderline terrifying.
But, if you spend too much time fawning over how impressive these things are, you might forget that something being impressive doesn't translate into something being useful.
Well, are they useful? ... Yeah, of course LLMs are useful, but we need to remain somewhat grounded in reality. How useful are LLMs? Well, they can dump out a boilerplate React frontend to a CRUD API, so I can imagine it could very well be harmful to a lot of software jobs, but I hope it doesn't bruise too many egos to point out that dumping out yet another UI that does the same thing we've done 1,000,000 times before isn't exactly novel. So it's useful for some software engineering tasks. Can it debug a complex crash? So far I'm around zero for ten and believe me, I'm trying. From Claude 3.7 to Gemini 2.5, Cursor to Claude Code, it's really hard to get these things to work through a problem the way anyone above the junior dev level can. Almost universally, they just keep digging themselves deeper until they eventually give up and try to null out the code so that the buggy code path doesn't execute.
So when Sabine says they're useless for interpreting scientific publications, I have zero trouble believing that. Scoring high on some shitty benchmarks whose solutions are in the training set is not akin to generalized knowledge. And these huge context windows sound impressive, but dump a moderately large document into them and it's often a challenge to get them to actually pay attention to the details that matter. The best shot you have by far is if the document you need it to reference definitely was already in the training data.
It is very cool and even useful to some degree what LLMs can do, but just scoring a few more points on some benchmarks is simply not going to fix the problems current AI architecture has. There is only one Internet, and we literally lit it on fire to try to make these models score a few more points. The sooner the market catches up to the fact that they ran out of Internet to scrape and we're still nowhere near the singularity, the better.
100% this. I think we should start producing independent evaluations of these tools for their usefulness, not for whatever made-up or convoluted evaluation index OpenAI, Google, or Anthropic throw at us.
Hardly. I've pretty much been using LLMs at least weekly (most of the time daily) since GPT-3.5. I am still amazed. It's really, really hard for me not to be bullish.
It kinda reminds me of the days I learned the Unix-like command line. At least once a week, I shouted to myself: "What? There is a one-liner that does that? People use awk/sed/xargs this way??" That's how I feel about LLMs so far.
> Well, are they useful? ... Yeah, of course LLMs are useful, but we need to remain somewhat grounded in reality. How useful are LLMs?
They are useful enough that they can passably replace (much more expensive) humans in a lot of noncritical jobs, thus being a tangible tool for securing enterprise bottom lines.
> they can dump out a boilerplate React frontend to a CRUD API
This is so clearly biased that it borders on parody. You can only get out what you put in.
The real use case of current LLMs is that any project that would previously have required collaboration can now be done solo with much faster turnaround. Of course, in 20 years when compute finally catches up, they will just be superintelligent AGI.
I see a difference between seeing them as valuable in their current state vs being "bullish about LLMs" in the stock market sense.
The big problem with being bullish in the stock market sense is that OpenAI isn't selling the LLMs that currently exist to their investors, they're selling AGI. Their pitch to investors is more or less this:
> If we accomplish our goal we (and you) will have infinite money. So the expected value of any investment in our technology is infinite dollars. No, you don't need to ask what the odds are of us accomplishing our goal, because any percent times infinity is infinity.
Since OpenAI and all the founders riding on their coattails are selling AGI, you see a natural backlash against LLMs that points out that they are not AGI and show no signs of asymptotically approaching AGI—they're asymptotically approaching something that will be amazing and transformative in ways that are not immediately clear, but what is clear to those who are watching closely is that they're not approaching Altman's promises.
The AI bubble will burst, and it's going to be painful. I agree with the author that that is inevitable, and it's shocking how few people see it. But also, we're getting a lot of cool tech out of it and plenty of it is being released into the open and heavily commoditized, so that's great!
I think that people who don't believe LLMs to be AGI are not very good at Venn diagrams. Because they certainly are artificial, general, and intelligent according to any dictionary.
I feel like LLMs are the same as the leap from "world before web search" to "world after web search." Yeah, in Google, you get crap links for sure, and you have to wade through salesy links and random blogs. But in the pre-web-search world, your options were generally "ask a friend who seems smart" or "go to the library for quite a while," AND BOTH OF THOSE OPTIONS HAD PLENTY OF ISSUES. I found a random part in an old Arduino kit I bought years ago, and GPT-4o correctly identified it and explained exactly how to hook it up and code for it. That is frickin awesome, and it saves me a ton of time and leads me to reuse the part. I used Deep Research to research car options that fit my exact needs, and it was 100% spot on - multiple people have suggested models that Deep Research did not identify that would be a fit, but every time I dig in, I find that Deep Research was right and the alternative actually had some dealbreaker I had specified. Etc., etc.
In the 90s, Robert Metcalfe infamously wrote "Almost all of the many predictions now being made about 1996 hinge on the Internet’s continuing exponential growth. But I predict the Internet, which only just recently got this section here in InfoWorld, will soon go spectacularly supernova and in 1996 catastrophically collapse." I feel like we are just hearing LLM versions of this quote over and over now, but they will prove to be equally accurate.
Generic. For the Internet, the more complex questions would have been "What are the potential benefits, what are the potential risks, what will grow faster," etc. The problem is not the growth but what that growth means. For LLMs, the big clear question is "will they stop being just LLMs, and when?" Progress is visible, but we seek a revolution.
It would be fine if it were sold that way, but there is so much hype. We're told that it's going to replace all of us and put us all out of our jobs. They set the expectations so high. Like, remember OpenAI showing a video of it doing your taxes for you? Predictions that super-intelligent AI is going to radically transform society faster than we can keep up? I think that's where most of the backlash is coming from.
> We're told that it's going to replace all of us and put us all out of our jobs.
I think this is the source of a lot of the hype. There are people salivating at the thought of no longer needing to employ the peasant class. They want it so badly that they'll say anything to get more investment in LLMs even if it might only ever allow them to fire a fraction of their workers, and even if their products and services suffer because the output they get with "AI" is worse than what the humans they throw away were providing.
They know they're overselling it, but they're also still on their knees praying that by some miracle their LLMs trained on the collective wisdom of facebook and youtube comments will one day gain actual intelligence and they can stop paying human workers.
In the meantime, they'll shove "AI" into everything they can think of for testing and refinement. They'll make us beta test it for them. They don't really care if their AI makes your customer service experience go to shit. They don't care if their AI screws up your bill. They don't care if their AI rejects your claims or you get denied services you've been paying for and are entitled to. They don't care if their AI unfairly denies you parole or mistakenly makes you the suspect of a crime. They don't care if Dr. Sbaitso 2.0 misdiagnoses you. Your suffering is worth it to them as long as they can cut their headcount by any amount and can keep feeding the AI more and more information because just maybe with enough data one day their greatest dream will become reality, and even if that never happens a lot of people are currently making massive amounts of money selling that lie.
The problem is that the bubble will burst eventually. The more time goes by and AI doesn't live up to the hype the harder that hype becomes to sell. Especially when by shoving AI into everything they're exposing a lot of hugely embarrassing shortcomings. Repeating "AI will happen in just 10 more years" gives people a lot of time to make money and cash out though.
On the plus side, we do get some cool toys to play with and the dream of replacing humans has sparked more interest in robotics so it's not all bad.
Yeah, it won't do your taxes for you, but it can sure help you do them yourself. Probably won't put you out of your job either, but it might help you accomplish more. Of course, one result of people accomplishing more in less time is that you need fewer people to do the same amount of work - so some jobs could be lost. But it's also possible that for the most part instead, more will be accomplished overall.
Forget OpenAI ChatGPT doing your taxes for you. Now Gemini will write up your sales slides about Gouda cheese, stating wrongly in the process that gouda makes up about 50% of all cheese consumption worldwide :) These use-cases are getting more useful by the day ;)
I mean, it's been like 3 years. 3 years after the web came out was barely anything. 3 years after the first GPU was cool, but not that cool. The past three years in LLMs? Insane.
Things could stall out and we'll have bumps and delays ... I hope. If this thing progresses at the same pace, or speeds up, well ... reality will change.
Or not. Even as they are, we can build some cool stuff with them.
> And people just sit around, unimpressed, and complain that ... what ... it isn't a perfect superintelligence that understands everything perfectly?
The trouble is that, while it is incredibly amazing, mind-blowing technology, it falls flat often enough that it is a big gamble to use. It is never clear, at least to me, what it is good at and what it isn't. Many things I assume it will struggle with, it handles with ease, and vice versa.
As the failures mount, I admittedly do find it becoming harder and harder to compel myself to see if it will work for my next task. It very well might succeed, but by the time I go to all the trouble to find out it often feels that I may as well just do it the old fashioned way.
If I'm not alone, that could be a big challenge in seeing long-term commercial success. Especially given that commercial success for LLMs is currently defined as 'take over the world' and not 'sustain mom and pop'.
> the speed at which it is progressing is insane.
But same goes for the users! As a result the failure rate appears to be closer to a constant. Until we reach the end of human achievement, where the humans can no longer think of new ways to use LLMs, that is unlikely to change.
It's becoming clear to me that some people just have vastly different uses and use cases than I do. Summarizing a deep, cutting-edge physics paper is, I'm sure, vastly different from summarizing a web page while I'm browsing HN, or writing a Python plugin for Icinga to monitor a web endpoint that spits out JSON.
The author says they use several LLMs every day and they always produce incorrect results. That "feels" weird, because it seems like you'd develop an intuition fairly quickly for the kinds of questions you'd ask that LLMs can and can't answer. If I want something with links to back up what is being said, I know I should ask Perplexity or maybe just ask a long-form prompt-like question of Google or Kagi. If I want a Python or bash program I'm probably going to ask ChatGPT or Gemini. If I want to work on some code I want to be in Cursor and am probably using Claude. For general life questions, I've been asking Claude and ChatGPT.
Running into the same issue with LLMs over and over for years, with all due respect, seems like the "doing the same thing and expecting different results" situation.
This is so true. I really hope she joins this conversation so we can have a productive discussion and understand what she's actually hoping to achieve.
The two sides are never going to understand each other because I suspect we work on entirely different things and have radically different workflows. I suspect that hackernews gets more use out of LLMs in general than the average programmer because they are far more likely to be at a web startup and more likely to actually be bottlenecked on how fast you can physically put more code in the file and ship sooner.
If you work on stuff that is at all niche (as in, stack overflow was probably not going to have the answer you needed even before LLMs became popular), then it's not surprising when LLMs can't help because they've not been trained.
For people that were already going fast and needed or wanted to put out more code more quickly, I'm sure LLMs will speed them up even more.
For those of us working on niche stuff, we weren't going fast in the first place or being judged on how quickly we ship in all likelihood. So LLMs (even if they were trained on our stuff) aren't going to be able to speed us up because the bottleneck has never been about not being able to write enough code fast enough. There are architectural and environmental and testing related bottlenecks that LLMs don't get rid of.
That's a good point, I've personally not got much use out of LLMs (I use them to generate fantasy names for my D&D campaign, but find they fall down for anything complex) - but I've also never got much use out of StackOverflow either.
I don't think I'm working on anything particularly niche, but nor is it cookie-cutter generic either, and that could be enough to drastically reduce their utility.
Two things can be true: e.g., that LLMs are incredible tech we only dreamed of having, and that they’re so flawed that they’re hard to put to productive use.
I just tried to use the latest Gemini release to help me figure out how to do some very basic Google Cloud setup. I thought my own ignorance in this area was to blame for the 30 minutes I spent trying to follow its instructions - only to discover that Gemini had wildly hallucinated key parts of the plan. And that’s Google’s own flagship model!
I think it’s pretty telling that companies are still struggling to find product-market fit in most fields outside of code completion.
It really depends on the task. Like Sabine, I’m operating on the very frontier of a scientific domain that is extremely niche. Every single LLM out there is worse than useless in this domain. It spits out incomprehensible garbage.
But ask it to solve some leet code and it’s brilliant.
The question I ask afterwards, then, is: is solving some leetcode brilliant? Is designing a simple inventory system brilliant if they've all been accomplished already? My answer tends towards no, since they still make mistakes in the process, and it harms newer developers' learning.
I should start collecting examples, if only for threads like this. Recently I tried to get an LLM to write a tsserver plugin that treats lines ending with "//del" as empty. You can only imagine all the sneaky failures in the chat and the total uselessness of the results.
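For scale, the core transform I wanted is trivial; here it is as a plain Python sketch (just the line filter, none of the actual tsserver plugin wiring, which is the part the models fell apart on):

```python
def blank_del_lines(source: str) -> str:
    """Replace every line ending in '//del' with an empty line,
    preserving line count so positions still map correctly."""
    return "\n".join(
        "" if line.rstrip().endswith("//del") else line
        for line in source.split("\n")
    )
```

The hard part is hooking a transform like this into the language service, not the string manipulation itself.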
Anything that is not literally millions (billions?) of times in the training set is doomed to be fantasized about by an LLM, in various ways, tones, etc. After many such threads I came to the conclusion that the people who find it mostly useful are simply treading water, as they probably have done most of their career. Their average product is a React form with a CRUD endpoint, and excitement about it. I can't explain their success reports otherwise, because it rarely works on anything beyond that.
Sabine is Lex Fridman for women. Stay in your lane of quantum physics and stop trying to opine on LLMs. I'm tired of seeing the huge amount of FUD from her.
Because it has a sample size of our collective human knowledge and language big enough to trick our brains into believing that.
As a parallel thought, it reminds me of a trick Derren Brown did. He picked every horse correctly across six races. The person he was picking for was obviously stunned, as was the audience watching it.
The reality of course is just that people couldn't comprehend that he just had to go to extreme and tedious lengths to make this happen. They started with 7000 people and filmed every one like it was going to be the "one" and then the probability pyramid just dropped people out. It was such a vast undertaking of time and effort that we're biased towards believing there must be something really happening here.
LLMs currently are a natural language interface to a Microsoft Encarta like system that is so unbelievably detailed and all encompassing that we risk accepting that there's something more going on there. There isn't.
> Wah, it can't write code like a Senior engineer with 20 years of experience!
No, that's not my problem with it. My problem with it is that inbuilt into the models of all LLMs is that they'll fabricate a lot. What's worse, people are treating them as authoritative.
Sure, sometimes it produces useful code. And often, it'll simply call the "doTheHardPart()" method. I've even caught it literally writing the wrong algorithm when asked to implement a specific and well known algorithm. For example, asking it "write a selection sort" and watching it write a bubble sort instead. No amount of re-prompts pushes it to the right algorithm in those cases either, instead it'll regenerate the same wrong algorithm over and over.
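For reference, the selection sort I was asking for is only a few lines; this is a textbook version, just to show how little it takes (the distinguishing feature is finding the minimum of the unsorted suffix and doing at most one swap per pass, unlike bubble sort's repeated adjacent swaps):

```python
def selection_sort(items):
    """Selection sort: for each position, find the smallest element
    in the remaining unsorted suffix and swap it into place.
    O(n^2) comparisons but only O(n) swaps."""
    a = list(items)  # work on a copy
    for i in range(len(a) - 1):
        smallest = i
        for j in range(i + 1, len(a)):
            if a[j] < a[smallest]:
                smallest = j
        if smallest != i:
            a[i], a[smallest] = a[smallest], a[i]
    return a
```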
Outside of programming, this is much worse. I've both seen online and heard in person people quoting LLM output as if it were authoritative. That, to me, is the bigger danger of LLMs to society. People just don't understand that LLMs aren't high-powered attorneys or world-renowned doctors. And, unfortunately, the incorrect perception of LLMs is being hyped both by LLM companies and by "journalists" who are all too ready to simply run with and discuss the press releases from said LLM companies.
> What's worse, people are treating them as authoritative. … I've both seen online and heard people quote LLM output as if it were authoritative.
That's not an LLM problem. But it is indeed quite bothersome. Don't tell me what ChatGPT told you. Tell me what you know. Maybe you got it from ChatGPT and verified it. Great. But my jaw kind of drops when people cite an LLM and just assume it's correct.
3rd Order Ignorance (3OI)—Lack of Process.
I have 3OI when I don't know a suitably efficient way to find out I don't know that I don't know something. This is lack of process, and it presents me with a major problem: If I have 3OI, I don't know of a way to find out there are things I don't know that I don't know.
— not from an LLM
My process: use LLMs and see what I can do with them while taking their output with a grain of salt.
My company just broadly adopted AI. It’s not a tech company and usually late to the game when it comes to tech adoption.
I'm counting down the days until some AI hallucination makes its way all the way to the C-suite. People will get way too comfortable with AI without understanding just how wrong it can be.
Some assumption will come from AI, no one will check it, and it'll become a basic business input. Then suddenly one day someone smart will say "that's not true" and someone will trace it back to AI. I know it.
I assume at that point in time there will be some general directive on using AI and not assuming it’s correct. And then AI will slowly go out of favor.
People fabricate a lot too. Yesterday I spent far less time fixing issues in the far more complex and larger changes Claude Code managed to churn out than in what the junior developer I worked with needed. Sometimes it's the reverse. But with my time factored in, working with Claude Code is generally more productive for me than working with a junior. The only reason I still work with a junior dev is as an investment in teaching him.
But this was true before LLMs. People would, and still do, take any old thing from an internet search and treat it as true. There is a known, difficult-to-remedy failure to properly adjudicate information and source quality, and you can find it discussed in research prior to the computer age. It is a user problem more than a system problem.

In my experience, with the way I interact with LLMs, they are more likely to give me useful output than not, and this is borne out by mainstream, non-edge-case, academic peer-reviewed work. Useful does not necessarily equal 100% correct, just as a Google search does not. I judge and vet all information, whether from an LLM, search, book, paper, or wherever.

We can build a straw person who "always" takes LLM output as true and uses it as-is, but those are the same people who use most information tools poorly, be they internet search, dictionaries, or even looking in their own files for their own work or sent mail (I say this as an IT professional who has seen all the worker types from the pre-internet days through now). In any case, we use automobiles despite others misusing them. But only the foolish among us completely take our hands off the wheel for any supposed "self-driving" features. While we must prevent and decry misuse by fools, we cannot let their ignorance hold us back. If anything, their misuse helps improve the tools, since it identifies more undesirable scenarios.
> My problem with it is that inbuilt into the models of all LLMs is that they'll fabricate a lot. What's worse, people are treating them as authoritative.
The same is true about the internet, and people even used to use these arguments to try to dissuade people from getting their information online (back when Wikipedia was considered a running joke, and journalists mocked blogs). But today it would be considered silly to dissuade someone from using the internet just because the information there is extremely unreliable.
Many programmers will say Stack Overflow is invaluable, but it's also unreliable. The answer is to use it as a tool and a jumping-off point to help you solve your problem, not to assume that it's authoritative.
The strange thing to me these days is the number of people who will talk about the problems with misinformation coming from LLMs, but then who seem to uncritically believe all sorts of other misinformation they encounter online, in the media, or through friends.
Yes, you need to verify the information you're getting, and this applies to far more than just LLMs.
> I've even caught it literally writing the wrong algorithm when asked to implement a specific and well known algorithm
Happened to me as well. I wanted it to quickly write an algorithm for the standard deviation over a stream of data, which is a textbook algorithm. It got it almost right, but messed up the final formula, and the code gave wrong answers. Weird, considering correct code for that problem exists on Wikipedia.
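For reference, the standard single-pass version (Welford's online algorithm, one of the variants on Wikipedia) is short in Python, and the easy-to-botch part is exactly that final formula at the end:

```python
import math

def streaming_stddev(stream):
    """Welford's online algorithm: one pass over the data,
    numerically stable, no need to store the stream."""
    n = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the mean
    for x in stream:
        n += 1
        delta = x - mean
        mean += delta / n
        m2 += delta * (x - mean)  # note: uses the *updated* mean
    if n < 2:
        return 0.0
    return math.sqrt(m2 / (n - 1))  # sample standard deviation
```

The two classic mistakes are dividing by n instead of n - 1 (population vs. sample), and using the pre-update mean in the m2 accumulation.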
I was part of preparing an offer a few weeks ago. The customer had prepared a lot of documents for us, maybe 100 pages in total. The boss insisted on using ChatGPT to summarize this stuff and reading only the summary. I did a longer, slower reading and caught some topics ChatGPT outright dropped. Our offer was based on the summary, and it fell through because we missed these nuances.
But hey, boss did not read as much as previously...
> What's worse, people are treating them as authoritative.
So what? People are wrong all the time. What happens when people are wrong? Things go wrong. What happens then? People learn that the way they got their information wasn't robust enough and they'll adapt to be more careful in the future.
This is the way it has always worked. But people are "worried" about LLMs... Because they're new. Don't worry, it's just another tool in the box, people are perfectly capable of being wrong without LLMs.
Humans bullshit and hallucinate and claim authority without citation or knowledge. They will believe all manner of things. They frequently misunderstand.
The LLM doesn’t need to be perfect. Just needs to beat a typical human.
LLM opponents aren’t wrong about the limits of LLMs. They vastly overestimate humans.
> I mean, I can ask for obscure things with subtle nuance where I misspell words and mess up my question and it figures it out.
If you're lucky, it figures it out. If you aren't, it makes stuff up in a way that seems almost purposefully calculated to fool you into assuming it's figured everything out. That's the real problem with LLMs: they fundamentally cannot be trusted because they're just a glorified autocomplete; they don't come with any inbuilt sense of when they might be getting things wrong.
I see this complaint a lot, and frankly, it just doesn't matter.
What matters is speeding up how fast I can find information. Not only will LLMs sometimes answer my obscure questions perfectly themselves, they also help to point me to the jargon I need to use to find that information online. In many areas this has been hugely valuable to me.
Sometimes you do just have to cut your losses. I've given up on asking LLMs for help with Zig, for example. It is just too obscure a language I guess, because the hallucination rate is too high to be useful. But for webdev, Python, matplotlib, or bash help? It is invaluable to me, even though it makes mistakes every now and then.
I am so confused too. I hold these beliefs at the same time, and I don't feel they contradict each other, but apparently for many people some of them do:
- LLMs are a miraculous technology that are capable of tasks far beyond what we believed would be achievable with AI/ML in the near future. Playing with them makes me constantly feel like "this is like sci-fi, this shouldn't be possible with 2025's technology".
- LLMs are fairly clueless at many tasks that are easy enough for humans, and they are nowhere near AGI. It's also unclear whether they scale up towards that goal. They are also worse programmers than people make them out to be. (At least I'm not happy with their results.)
- Achieving AGI no longer seems impossibly unlikely, and doing so is likely to be an existentially disastrous event for humanity, and the worst fodder of my nightmares. (Also in the sense of an existential doomsday scenario, but even just the thought of becoming... irrelevant is depressing.)
Having one of these beliefs makes me the "AI hyper" stereotype, another makes me the "AI naysayer" stereotype and yet another makes me the "AI doomer" stereotype. So I guess I'm all of those!
> but even just the thought of becoming... irrelevant is depressing
In my opinion, there can exist no AI, person, tool, ultra-sentient omniscient being, etc. that would ever render you irrelevant. Your existence, experiences, and perception of reality are all literally irreplaceable, and (again, just my opinion) inherently meaningful. I don't think anyone's value comes from their ability to perform any particular feat to any particular degree of skill. I only say this because I had similar feelings of anxiety when considering the idea of becoming "irrelevant", and I've seen many others say similar things, but I think that fear is largely a product of misunderstanding what makes our lives meaningful.
I guess Sabine's beef with LLMs is that they are hyped as a legit "human-level assistant" kind of thing by the business people, which they clearly aren't yet. Maybe I've just managed to... manage my expectations?
Back when handwriting recognition was a new thing, I was greatly impressed by how good it was. This was primarily because, being an engineer, I knew how difficult the problem is to solve. 90% recognition seemed really good to me.
When I tried to use the technology, that 90% meant 1 out of every 10 things I wrote was incorrect. If it had been a keyboard, I would have thrown it in the trash. That is where my Palm ended up.
People expect their technology to do things better than a human, not almost as well. Waymo, with LIDAR, hasn't killed people. Tesla, with cameras only, has done so multiple times. I will ride in a Waymo, never in a Tesla self-driving car.
Anyone who doesn't understand this either doesn't need the utility it provides or has no idea how to prompt it correctly. My wife is a bookkeeper. There are some tasks that are a pain in the ass without writing some custom code. In her case, we just saved her about 2 hours by asking Claude to do it. It wrote the code, applied the code to a CSV we uploaded, and gave us exactly what we needed in 2 minutes.
>Anyone who doesn't understand this either doesn't need the utility it provides or has no idea how to prompt it correctly.
Almost every counter-criticism of LLMs boils down to:
1. you're holding it wrong
2. Well, I use it $DAYJOB and it works great for me! (And $DAYJOB is software engineering).
I'm glad your wife was able to save 2 hours of work, but forgive me if that doesn't translate to the trillion dollar valuation OpenAI is claiming. It's strange you don't see the inherent irony in your post. Instead of your wife just directly uploading the dataset and a prompt, she first has to prompt it to write code. There are clear limitations and it looks like LLMs are stuck at some sort of wall.
It's definitely a tech that's here to stay, unlike block chain/nfts
But I mirror the confusion why people are still bullish on it.
The current valuation for it is because the market thinks that it's able to write code like a senior engineer and have AGI, because that's how they're marketed by the LLM providers.
I'm not even certain if they'll be ubiquitous after the venture capital investments are gone and the service needs to actually be priced without losing money, because they're (at least currently) mostly pretty expensive to run.
There seems to be a widely held misconception that company valuations have any basis in the underlying fundamentals of what the companies do. This is not and has not been the case for several years. The US stock market’s darlings are Kardashians, they are valuable for being valuable the way the Kardashians are famous for being famous.
In markets, perception is reality, and the perception is that these companies are innovative. That’s it.
NFT is still a great tool if you want a bunch of unique tokens as part of a blockchain app. ERC-721 was proven a capable protocol in a variety of projects. What it isn't, and never will be, is an amazing investment opportunity, or a method to collect cool rare apes and go to yacht parties.
LLMs will settle in and have their place too, just not in the forefront of every investors mind.
I am more than happy to pay for access to LLMs, and models continue to get smaller and cheaper. I would be very surprised if they are not far more widely used in 5 or 10 years time than they are today.
Even if the VC-backed companies jacked up their prices, the models that I can run on my own laptop for "free" now are magical compared to the state of the art from 2 years ago. Ubiquity may come from everyone running these on their own hardware.
Takes like yours seem crazy to me given the pace of things. We can argue all day about whether people are "too bullish" on the market size of enterprise AI, but truly, absolutely no one knows how good these things will get or which problems they'll overcome in the next 5 years. Saying "I am confused why people are still bullish" implicitly builds in some huge assumptions about the near future.
Most “AI” companies are simply wrapping the ChatGPT API in some form. You can tell from the job posts.
They aren't building anything themselves. I find this disingenuous at best, and a sign to me of a bubble.
I also think that re-branding Machine Learning as AI to also be disingenuous.
These technologies of course have their use cases and excel at some things, but this isn't the ushering in of actual, sapient intelligence, which for the majority of the term's existence was the de facto agreed standard for "AI". This technology lacks the actual markers of what is generally accepted as intelligence to begin with.
Remember the quote that IBM thought there would be a total market for maybe 10 or 15 computers in the entire world? They were giant, and expensive, and very limited in application.
Tesla is valued based on the hope that it'll be the first to fully self-driving cars. I don't think stock markets need to make sense: you invest in things that, if they pan out, could have huge growth. That's why LLMs are being invested in. Alternatives will make you some ROI, but if LLMs do break through to major disruption in even a handful of large markets, your ROI will be huge.
That's not really true. Just the entertainment value alone is already causing OpenAI to rate limit its systems, and they're buying up significant amounts of NVIDIA's capacity, and NVIDIA itself is buying up significant portions of the entire world's chip-making budget. Even if just limited to entertainment, the value is immense, apparently.
That's a funny comparison. I can and do use cryptocurrency to pay for web hosting, a VPN, and a few other things, as it's become the native currency of the internet. I love LLMs too, but agree with the parent comment that it's inevitable they'll be replaced with something better, while Bitcoin seems to be sticking around for the long term.
> The current valuation for it is because the market thinks that it's able to write code like a senior engineer and have AGI, because that's how they're marketed by the LLM providers.
No it's not. If it was valued for that it'd be at least 10X what it is now.
Blockchain is here to stay; this is way past the point of "believing in the tech". Recently a wss:// order-book exchange (Hyperliquid) crossed $1T in volume traded, and they started in 2023.
Blockchains are becoming real-time data structures where everyone has admin-level read-only access to everything.
It's more like duct-taping a VR headset to your head, calibrating your environment to a bunch of cardboard boxes and walls, and calling it a holodeck. It actually kinda works until you push at it too hard.
It reminds me a lot of when I first started playing No Man's Sky (the video game). Billions of galaxies! Exotic, one of a kind life forms on every planet! Endless possibilities! I poured hundreds of hours into the game! But, despite all the variety and possibilities, the patterns emerge, and every 'new' planet just feels like a first-person fractal viewer. Pretty, sometimes kinda nifty, but eventually very boring and repetitive. The illusion wore off, and I couldn't really enjoy it anymore.
I have played with a LOT of models over the years. They can be neat, interesting, and kinda cool at times, but the patterns of output and mistakes shatters the illusion that I'm talking to anything but a rather expensive auto-complete.
I'm in the same boat, and I think it boils down to this: some people are actually quite passive, while others are more active in their use of technology.
It'd take more time for me to flesh this out than I want to give, but the basic idea is that I am not just sitting there "expecting things". I've been puzzled too at why so many people don't seem to get it or are as frustrated as this lady, and in my observation this is their common element. It just looks very passive to me, the way they seem to use the machines and expect a result to be "given" to them.
PS. It reminds me very strongly of how our parents' generation uses computers. The whole way of thinking is different; I cannot even understand why they would act certain ways or be afraid of acting in other ways. It's like they use a different compass, or have a very different (and wrong) model in their head of how this thing in front of them works.
> And people just sit around, unimpressed, and complain that ... what ... it isn't a perfect superintelligence that understands everything perfectly?
IMO there are two distinct reasons for this:
1. You've got the Sam Altmans of the world claiming that LLMs are, or nearly are, AGI and that ASI is right around the corner. It's obvious this isn't true even if LLMs are still incredibly powerful and useful. But Sam doing the whole "is it AGI?" dance gets old really quick.
2. LLMs are an existential threat to basically every knowledge worker job on the planet. Peoples' natural response to threats is to become defensive.
I’m not sure how anyone can claim number 2 is true, unless it’s someone who is a programmer doing mostly grunt code and thinks every knowledge worker job is similar.
Just off the top of my head there are plenty of knowledge worker jobs where the knowledge isn’t public, nor really in written form anywhere. There just simply wouldn’t be anything for AI to train on.
> LLMs are an existential threat to basically every knowledge worker job on the planet.
Given the typical problems of LLMs, they are not. You still need them to check the results. It's like FSD: impressive when it works, bad when it doesn't, and scary because you never know beforehand when it's failing.
How much time do I need to devote to see anything but garbage?
For reference, I program systems code in C/C++ in a large, proprietary codebase.
My experiences with OpenAI (a year ago or more) and, more recently, Cursor, Grok-v3, and Deepseek-r1 were all failures. The latter two started out OK and got worse over time.
What I haven't done is ask "AI" to whip up a more standard application. I have some ideas (an ncurses frontend to p4 written in Python, similar to tig, for instance), but haven't gotten around to it.
I want this stuff to work, but so far it hasn't. Now, I don't think "programming" a computer in English is a very good idea anyway, but I want a competent AI assistant to pair program with. To the degree that people are getting results, it seems to me they are leveraging very high-level APIs/libraries of code which are not written by AI, and solving well-solved, "common" problems (simple games, simple web or phone apps). Sort of like how people gloss over the heavy lifting done by language itself when they praise the results from LLMs in other fields.
I know it eventually will work. I just don't know when. I also get annoyed by the hype of folks who think they can become software engineers because they can talk to an LLM. Most of my job isn't programming. Most of my job is thinking about what the solution should be, talking to other people like me in meetings, understanding what customers really want beyond what they are saying, and tracking what I'm doing in various forms(which is something I really do want AI to help me with).
Vibe coding is aptly named because it's sort of the VB6 of the modern era. Holy cow! I wrote a Windows GUI App!!! It's letting non-programmers and semi-programmers (the "I write glue code in Python to munge data and API ins/outs" crowd) create usable things. Cool! So did spreadsheets. So did Hypercard. Andrej tweeting that he made a phone app was kinda cool but also kinda sad. If this is what the hundreds of billions spent on AI (and my bank account thanks you for that) delivers, then the bubble is going to pop soon.
Far from just programming too. They're useful for so many things. I use it for quickly coming up with shell scripts (or even complex piped commands (or if I'm being honest even simple commands since it's easier than skimming the man page)). But I also use it to bounce ideas off of when negotiating contracts. Or to give me a spoiler-free reminder of a plot point I'd forgotten in a book or TV series. Or to explain legal or taxation issues (which I of course verify, but it points me in the right direction). Or any number of other things.
As the parent says, while far from perfect, they're an incredible aid in so many areas. When used well, they help you produce not just faster but also better results. The only trick really is that you need to treat it as a (very knowledgeable but overconfident) collaborator rather than an oracle.
I love using it to boilerplate code for a new API I want to integrate. Much better than having to manually search. In the near future, not knowing how to effectively use AI to enhance productivity will be a disadvantage to potential employers.
I use ChatGPT all the time. I really like it. It's not perfect; how I've described it (and I doubt that I'm unique in this): it's like having a really smart and eager intern at your disposal.
I say "intern" in the sense that its error-prone and kind of inexperienced, but also generally useful. I can ask it to automatically create a lot of the bootstrapping or tedious code that I always dread writing so that I can focus on the fun stuff, which is often the stuff that's pawned off onto interns and junior-level engineers. I think for the most part, when you treat it like that, it lives up to and sometimes even surpasses expectations.
I mean, I can't speak for everyone, but whenever I begin a new project, a large percentage of the first ~3 hours is simply copying and pasting and editing from documentation, either an API I have to call or some bootstrapping code from a framework or just some cruft to make built-in libraries work how you want. I hate doing all that, it actively makes me not want to start a new project. Being able to get ChatGPT to give me stuff that I need to actually get started on my project has made coding a lot more fun for me again. At this point, you can take my LLM from my cold dead hands.
I do think it will keep getting better, but I'm also at a point where even if it never improves I will still keep using it.
I guess it totally depends on the specific use cases.
Just yesterday, I asked Gemini 2.5 how to test a Laravel artisan command that internally calls another artisan command, using the Pest testing framework.
Gemini showed me a method that looked perfect.
But when I tried to execute it, I got an error that the method doesn't exist.
So I asked Gemini about this, and it told me: "Ah, apologies! That's a classic sign that you might be using a version of Laravel older than 5.7."
So I told it that I am actually using the latest release (12.3), to which it responded with: "You're right to question things when you get an error! However, let's clarify the Laravel version first.
As of today, March 27, 2025, the latest stable version of Laravel is Laravel 11, which was released in March 2024. Laravel 12 has not been released yet (it's expected roughly in Q1 2026 based on the usual schedule).
Could you please double-check the exact Laravel version you are using?"
So it did not believe me and I had to convince it first that I was using a real version. This went on for a while, with Gemini not only hallucinating stuff, but also being very persistent and difficult to convince of anything else.
Well, in the end it was still certain that this method should exist, even though it could not provide any evidence for it, and my searching through the internet and the Git history of the related packages also turned up nothing.
So I gave up and tried it with Claude 3.7 which could also not provide any working solution.
In the end, I found an entirely different solution for my problem, but that wasn't based on anything the AIs told me, but just my own thinking and talking to other software developers.
I would not go so far as to call these AIs useless. In software development they can help with simple stuff and boilerplate code, and I found them a lot more helpful in creative work. This is basically the opposite of what I would have expected 5 years ago ^^
But for any important tasks, these LLMs are still far too unreliable.
They often feel like they have a lot of knowledge, but no wisdom.
They don't know how to apply their knowledge ideally, and they often basically brute-force it with a mix of strange creativity and statistical models apparently trained on a vast amount of internet content, big parts of which are troll content and satire.
My issue with higher ups pushing LLMs is that what slows me down at work is not having to write the code. I can write the code. If all I had to do was sit down and write code, then I would be incredibly productive because I'm a good programmer.
But instead, my productivity is hampered by issues with org communication, structure, siloed knowledge, lack of documentation, tech debt, and stale repos.
I have for years tried to provide feedback and get leadership to do something about these issues, but they do nothing and instead ask "How have you used AI to improve your productivity?"
I've had the same experience as you, and also rather recently. I had to learn two lessons: first, what I could trust it with (as with Wikipedia when it was new), and second, what makes sense to ask it (as with YouTube when it was new). Once I got that down, it is one fabulous tool to have on my belt, among many other tools.
Thing is, the LLMs that I use are all freeware, and they run on my gaming PC. Two to six tokens per second are alright honestly. I have enough other things to take care of in the meantime. Other tools to work with.
I don't see the billion dollar business. And even if that existed, the means of production would be firmly in the hands of the people, as long as they play video games. So, have we all tripled our salaries?
If we haven't, is that because knowledge work is a limited space that we are competing in, and LLMs are an equalizer because we all have them? Because I was taught that knowledge work was infinite. And the new tool should allow us to create more, and better, and more thoroughly. And that should get us all paid better.
Depends on your use case. If you don't need them to be the source of truth, then they work great, but if you do, the experience sucks because they're so unreliable.
The problems start when people start hyperventilating, thinking that because LLMs can generate tests for a function for you, they'll be replacing engineers soon. They're only suitable for generating output that you can easily verify to be correct.
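A minimal sketch of that "trust but verify" workflow: before using a generated function, run it against cases whose answers you already know. `generated_slug` here is a made-up stand-in for LLM-produced code.

```python
# "Trust but verify": check generated code against known answers
# before relying on it. generated_slug stands in for LLM output.
def generated_slug(title: str) -> str:
    return "-".join(title.lower().split())

known_cases = {
    "Hello World": "hello-world",
    "  Spaced   Out  ": "spaced-out",
}
for inp, expected in known_cases.items():
    assert generated_slug(inp) == expected, (inp, generated_slug(inp))
print("all known cases pass")
```

If the generated function fails a case you already know the answer to, you've caught the problem before it ships; if it passes, you've at least bounded the risk.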
LLM training is designed to distill a massive corpus of facts, in the form of token sequences, into a much, much smaller bundle of information that encodes (somehow!) the deep structure of those facts minus their particulars.
They’re not search engines, they’re abstract pattern matchers.
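A back-of-envelope illustration of that distillation, using made-up but plausible numbers (none of these figures describe any specific model):

```python
# Rough compression ratio of training corpus to model weights.
# All numbers below are illustrative assumptions, not real figures.
corpus_tokens = 10e12        # assume ~10 trillion training tokens
bytes_per_token = 4          # assume ~4 bytes of UTF-8 text per token
params = 70e9                # assume a 70B-parameter model
bytes_per_param = 2          # fp16 weights

corpus_bytes = corpus_tokens * bytes_per_token   # ~40 TB of text
model_bytes = params * bytes_per_param           # ~140 GB of weights
print(f"corpus / model ≈ {corpus_bytes / model_bytes:.0f}x")
```

Under these assumptions the weights are hundreds of times smaller than the text they were trained on, which is why particulars (exact quotes, URLs, citations) are exactly what gets lost.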
I asked Grok to describe a picture I took of me and my kid at Hilton Head island. Based on the plant life it guesses it was a southeast barrier island in Georgia or the Carolinas. It guessed my age and my son’s age. LLMs are completely insane tech for a 90s kid. The first fundamental advance in tech I’ve seen in my lifetime—like what it must’ve been like for people who used a telephone for the first time, or watched a television.
Flat TVs, digital audio players (the iPod), the smartphone, laptops, smartwatches... You have a very selective definition of an advance in tech. Compare today (minus LLMs) with any movie depicting life in the nineties and you can see how much tech has developed.
There are basically 3 categories of LLM users (very roughly).
1. People creating or dealing with imprecise information. People doing SEO spam, people dealing with SEO spam, almost all creative arts people, people writing corporatese- or legalese- documents or mails, etc. For these tasks LLMs are god-like.
2. People dealing with precise information and/or facts. For these people an LLM is no better than a parrot.
3. A subset of 2: programmers. Because of the huge amount of stolen training data, plus almost perfect proofing software in the form of compilers, static analyzers, etc., for this case LLMs are more or less usable; the more data was used, the better (JS is the best, as I understand it).
This is why people's reactions are so polarized. Their results differ.
The crisis in programming hasn’t been writing code. It has been developing languages and tools so that we can write less of it that is easy to verify as correct. These tools generate more code. More than you can read and more than you will want to before you get bored and decide to trust the output. It is trained on the most average code available that could be sucked up and ripped off the Internet. It will regurgitate the most subtle errors that humans are not good at finding. It only saves you time if you don’t bother reading and understanding what it outputs.
I don’t want to think about the potential. It may never materialize. And much of what was promised even a few years ago hasn’t come to fruition. It’s always a few years away. Always another funding round.
Instead we have massive amounts of new demand for liquid methane, infrastructure struggling to keep up, billions of gallons of fresh water wasted, all so that rich kids can vibe code their way to easy money and realize three months later they’ve been hacked and they don’t know what to do. The context window has been lost and they ran out of API credits. Welcome to the future.
Yeah, basically this. If I look at how it helps me as an individual, I can totally see how AI can sometimes be useful. If I take a look at the societal effects of AI, it becomes apparent that AI is just a net negative. Some examples:
- AI is great for disinformation
- AI is great at generating porn of women without their consent.
- Open source projects massively struggle as AI scrapers DDOS them.
- AI uses massive amounts of energy and water; most importantly, the expectation is that energy usage will rise drastically in a world where we need to lower it. If Sam Altman gets his way, we're toast.
- AI makes us intellectually lazy and worse thinkers. We were already learning less and less in school because of our impoverished attention span. This is even worse now with AI.
- AI makes us even more dependent on cloud vendors and third-parties, further creating a fragile supply chain.
Like AI ostensibly empowers us as individuals, but in reality I think it's a disservice, and the ones it truly empowers are the tech giants, as citizens become dumber and even more dependent on them and tech giants amass more and more power.
I can't believe I had to dig this deep to find this comment.
I have yet to see an AI-generated image that was "really cool".
AI images and videos strike me as the coffee pods of the digital world -- we're just absolutely littering the internet with garbage. And as a bonus, it's also environmentally devastating to the real world!
I live nearby a landfill, and go there often to get rid of yard waste, construction materials, etc. The sheer volume of perfectly serviceable stuff people are throwing out in my relatively small city (<200k) is infuriating and depressing. I think if more people visited their local landfills, they might get a better sense for just how much stuff humans consume and dispose. I hope people are noticing just how much more full of trash the internet has become in the last few years. It seems like it, but then I read this thread full of people that are still hyped about it all and I wonder.
This isn't even to mention the generated text... it's all just so inane and I just don't get it. I've tried a few times to ask for relatively simple code and the results have been laughable.
If you ask for obscure things, how do you know you are getting right answers? In my experience, unless the thing you are looking for is easily found with a google search, LLMs have no hope of getting it correct. For me this is mostly trying to code against an obscure API that isn't well documented, where the little documentation there is is spread across multiple wikis. And the LLMs keep hallucinating functions that simply do not exist.
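One cheap defense against hallucinated functions is to check whether a suggested symbol actually exists in the installed library before trusting the generated code. A Python sketch (`json.smart_parse` is a deliberately fake name used as the negative example):

```python
import importlib

def symbol_exists(module_name: str, attr: str) -> bool:
    """Return True only if the named attribute really exists in the
    installed module -- a quick sanity check for LLM-suggested APIs."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return False
    return hasattr(module, attr)

print(symbol_exists("json", "dumps"))        # real function
print(symbol_exists("json", "smart_parse"))  # plausible-sounding, made up
```

It won't catch a real function called with the wrong semantics, but it filters out the flat-out invented ones in seconds.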
It is an amazing technology and like crypto/blockchain it is nerdy to understand how it works and play with it. I think there are two things at stake here:
1. Some people are just uncomfortable with it because it “could” replace their jobs.
2. Some people are warning that the ecosystem bubble is significantly out of proportion. They are right, and having the whole stock market, companies, and US economy attached to LLMs is just downright irresponsible.
> Some people are just uncomfortable with it because it “could” replace their jobs.
What jobs are seriously at risk of being totally replaced by LLMs? Even in things like copywriting and natural language translation, which are somewhat of a natural "best case" for the underlying tech, their output is quite subpar compared to the average human's.
> And people just sit around, unimpressed, and complain that ... what ... it isn't a perfect superintelligence that understands everything perfectly
Hossenfelder is a scientist. There's a certain level of rigour that she needs to do her job, which is where current LLMs often fall down. Arguably it's not accelerating her work to have to check every single thing the LLM says.
I use them everyday and they save me so much time and enable me to do things that I wouldn't be able to do otherwise just due to the amount of time it would take.
I think some people just aren't using them correctly or don't understand their limitations.
They are especially helpful for getting me over thought paralysis when starting a new project.
The frustration of using an LLM is greater than the frustration of doing it myself. If it's going to be a tool, it needs to work. Otherwise, it's just a research toy.
They can do fun and interesting stuff, but we keep hearing how they’re going to replace human workers, and too many people in positions of power not only believe they are capable of this, but are taking steps to replace people with LLMs.
But while they are fun to play with, anything that requires a real answer, but can’t be directly and immediately checked, like customer support, scientific research, teaching, legal advice, identifying humans, correctly summarizing text - LLMs are very bad at these things, make up answers, mix contexts inappropriately, and more.
I’m not sure how you can have played with LLMs so much and missed this. I hope you don’t trust what they say about recipes or how to handle legal problems or how to clean things or how to treat disease or any fact-checking whatsoever.
>I’m not sure how you can have played with LLMs so much and missed this. I hope you don’t trust what they say about recipes or how to handle legal problems or how to clean things or how to treat disease or any fact-checking whatsoever.
This is like a GPT3.5 level criticism. o1-pro is probably better at pure fact retrieval than most PhDs in any given field. I challenge you to try it.
The main issue is that if you ask most LLMs to do something they aren't good at, they don't say "Sorry, I'm not sure how to do that yet"; they say "Sure, absolutely! Here you go:" and proceed to make things up, provide numbers or code that don't actually add up, and make up references and sources.
To someone who doesn't actually check or have the knowledge or experience to check the output, it sounds like they've been given a real, useful answer.
When you tell the LLM that the API it tried to call doesn't exist it says "Oh, you're right, sorry about that! Here's a corrected version that should work!" and of course that one probably doesn't work either.
Yes. One of my early observations about LLMs was that we've now produced software that regularly lies to us. It seems to be a quite intractable problem. Also, since there's no real visibility as to how an LLM reaches a conclusion, there's no way to validate anything.
One takeaway from this is that labelling LLMs as "intelligent" is a total misnomer. They're more like super parrots.
For software development, there's also the problem of how up to date they are. If they could learn on the fly (or be constantly updated) that would help.
They are amazing in some ways, but they've been over-hyped tremendously.
I agree, they are an amazing piece of technology, but the investment and hype don't match the reality. This might age like milk, but I don't think OpenAI is going to make it. They burnt $9B to lose $5B in 2024, trying to raise money like their life depends on it... because their life depends on it. From what I can tell, none of the AI model producers are profiting from their model usage at this point, except maybe Deepseek. This will be a market, they are useful, astonishingly impressive even, but IMO they are either going to become waaayy more expensive to use, and/or the market/investment will greatly shrink to be sustainable.
When I saw GPT-3 in action in 2023, I couldn’t believe my eyes. I thought I was being tricked somehow. I’d seen ads for “AI-powered” services and it was always the same unimpressive stuff. Then I saw GPT-3 and within minutes I knew it was completely different. It was the first thing I’d ever seen that felt like AI.
That was only a few years ago. Now I can run something on my 8GB MacBook Air that blows GPT-3 out of the water. It’s just baffling to me when people say LLM’s are useless or unimpressive. I use them constantly and I can still hardly believe they exist!!
LLMs are better at formally verifiable tasks like coding, also coding makes more money on a pure demand basis so development for it gets more resources. In descriptive science fields, it's not great because science fields don't generate a lot of text compared to other things, so the training data is dwarfed by the huge corpus of general internet text. The software industry created the internet and loves using it, so they have published a lot more text in comparison. It can be really bad in bio for example.
Is your testing adversarial or merely anecdotal curiosity? If you don't actively look for it why would you expect to find it?
It's bad technology because it wastes a lot of labor, electricity, and bandwidth in a struggle to achieve what most human beings can with minimal effort. It's also a blatant thief of copyrighted materials.
If you want to like it, guess what, you'll find a way to like it. If you try to view it from another person's use case, you might see why they don't like it.
> can ask for obscure things with subtle nuance where I misspell words and mess up my question and it figures it out. It talks to me like a person. It generates really cool images. It helps me write code. And just tons of other stuff that astounds me.
It is an impressive technology, but is it US$244.22bn [1] impressive (I know this stat is supposed to account for computer vision as well, but seeing as LLMs are now a big chunk of that, I think it's a safe assumption)? It's projected to grow to over US$1tr by 2031. That's higher than the market size of commercial aviation at its peak [2]. I'm sorry, but I don't think a cool chatbot is approximately as important as flying.
You no longer have the console as the primary interface, but a GUI, which 99.9+% of computer users control via a mouse.
You no longer have the screen as the primary interface, but an AUI (audio user interface), which 99.9+% of computer users control via a headset, earbuds, or a microphone and speaker pair.
You mostly speak and listen to other humans, and if you're not reading something they've written, you could have it read to you in order to detach from the screen or paper.
You'll talk with your computer while in the car, while walking, or while sitting in the office.
An LLM makes the computer understand you, and it allows you to understand the computer.
Even if you use smart glasses, you'll mostly talk to the computer generating the displayed results, and it will probably also talk to you, adding information to the displayed results. It's LLMs that enable this.
Just don't focus too much on whether the LLM knows how high Mount Kilimanjaro is; its knowledge of that fact is simply a hint that it can properly handle language.
Still, it's remarkable how useful they are at analyzing things.
LLMs have a bright future ahead, or whatever technology succeeds them.
I don’t even argue that they might get useful at some point, but when I point a mouse at a button and press the button it usually results in a reliable action.
When I use the LLM (I have so far tried: Claude, ChatGPT, DeepSeek, Mistral) it does something but that something usually isn’t what I want (~the linked tweet).
Prompting, studying and understanding the result and then cleaning up the mess for the low price of an expensive monthly sub leaves me with worse results than if I did the thing myself, usually takes longer and often leaves me with subtle bugs I’m genuinely afraid of growing into exploitable vulnerabilities.
Using it strictly as a rubber duck is neat but also largely pointless.
Since other people are getting something out of the tech, I’ll just assume that the hammer doesn’t fit my nails.
It's like a mouse that some variable proportion of the time pretends it's moved the cursor and clicked a button, but actually it hasn't and you have to put a lot of work in to find out whether it did or didn't do what you expected.
It used to be annoying enough just having to clean the trackball, but at least you knew when it wasn't working.
I think it’s more that the people who are boosting LLMs are claiming that perfect super intelligence is right around the corner.
Personally, I look back at how many years ago it was that we were seeing claims that truck drivers were all going to lose their jobs and society would tear itself apart over it within the next few years… and yet here we still are.
I'm completely with you. The technology is absolutely fascinating in its own right.
That said, I do experience frustrations:
- Getting enraged when it messes up perfectly good code it wrote just 10 minutes ago
- Constantly reminding it we're NOT using jest to write tests
- Discovering it's created duplicate utilities in different folders
There's definitely a lot of hand-holding required, and I've encountered limitations I initially overlooked in my optimism.
But here's what makes it worthwhile: LLMs have significantly eased my imposter syndrome when it comes to coding. I feel much more confident tackling tasks that would have filled me with dread a year ago.
I honestly don't understand how everyone isn't completely blown away by how cool this technology is. I haven't felt this level of excitement about a new technology since I discovered I could build my own Flash movies.
It depends. For small tasks like summarization or self-contained code snippets, it’s really good—like figuring out how to inspect a binary executable on Linux, or designing a ranking algorithm for different search patterns. If you only want average performance or don’t care much about the details, it can produce reasonable results without much oversight.
But for larger tasks—say, around 2,000 lines of code—it often fails in a lot of small ways. It tends to generate a lot of dead code after multiple iterations, and might repeatedly fail on issues you thought were easy to fix. Mentally, it can get exhausting, and you might end up rewriting most of it yourself. I think people are just tired of how much we expect LLMs to deliver, only for them to fail us in unexpected ways. The LLM is good, but we really need to push to understand its limitations.
This is true. But it needs to be more than a toy if it is to be economically viable.
So far the industrial applications haven't been that promising, code writing and documentation is probably the most promising but even there it's not like it can replace a human or even substantially increase their productivity.
I think the perception of their usefulness depends on how often you ask/google questions. If you are constantly wondering about one thing or another, LLMs are amazing - especially compared to previous alternatives like googling or asking on Reddit.
If you don’t constantly look for information, they might be less useful.
I'm a senior engineer with 20 years of experience and mostly find all of the AI bs of the last couple of years to be occasionally helpful for general stuff but absolutely incompetent when I need help with mildly complicated tasks.
I did have a eureka moment the other day with deepseek and a very obscure bug I was trying to tackle. One api query was having a very weird, unrelated side effect. I loaded up cursor with a very extensive prompt and it actually figured out the call path I hadn't been able to track down.
Today, I had a very simple task that eventually only took me half an hour to manually track. But I started with cursor using very similar context as the first example. It just kept repeatedly dreaming up non-existent files in the PR and making suggestions to fix code that doesn't exist.
So what's the worth to my company of my very expensive time? Should I spend 10,20,50 percent of my time trying to get answers from a chatbot, or should I just use my 20 years of experience to get the job done?
I’ve been playing with Gemini 2.5 Pro, throwing all kinds of personal-productivity problems at it, and it’s mostly one-shotting them. I’m still in disbelief tbh.
A lot of people who don’t understand how to use LLM effectively will be at an economic disadvantage.
Can you give some examples? Do you mean things like "How do I control my crippling anxiety", things like "What highways would be best to take to Chicago", things like "Write me a Python library to parse the file format in this hex dump", or things like "What should I make for dinner"?
“The growth of the Internet will slow drastically, as the flaw in ‘Metcalfe’s law' becomes apparent: most people have nothing to say to each other! By 2005, it will become clear that the Internet’s impact on the economy has been no greater than the fax machine’s”
Same as reading books, Internet, Wikipedia, working towards/keeping your health and fitness, etc...
The quote about books being a mirror reflecting genius or idiocy seems to apply.
I see LLMs as a kind of hyper-keyboard: speeding up typing AND structuring content, completing thoughts, and inspiring ideas.
Unlike a regular keyboard, an LLM transforms input contextually. One no longer merely types but orchestrates concepts and modulates language, almost like music.
Yet mastery is key. Just as a pianist turns keystrokes into a symphony through skill, a true virtuoso wields LLMs not as a crutch but as an amplifier of thought.
As a 50+ nerd, for decades I carried the idea: can't we just build a sufficiently large neural net, throw some data at it, and have it somehow be usefully intelligent? So this is showing strong signs of something I've been waiting for.
In the 70's I read in some science book for kids about how one day we will likely be able to use light emitting diodes for illumination instead of light bulbs, and this "cold light" will save us lots of energy. Waited out that one too; it turned out so.
I’m reminded of how I always think current cutting edge good examples of CG in movies looks so real and then, consistently, when I watch it again in 10 years it always looks distractingly shitty.
Perhaps you have already paid off your mortgage and saved up a million dollars for retirement? And you're not threatened by dismissal or salary reduction because supposedly "AI will replace everyone."
By the way, you don't need to be a 50+ year old nerd. Nerds are a special culture-pen where smart straight-A students from schools are placed so they can work, increase stakeholder revenues, and not even accidentally be able to do anything truly worthwhile that could redistribute wealth in society.
> And people just sit around, unimpressed, and complain that ... what ... it isn't a perfect superintelligence that understands everything perfectly
More like we note the frequency with which these tools produce shallow, bordering on useless, responses, note the frequency with which they produce outright bullshit, and conclude their output should not be taken seriously. This smells like the fervor around ELIZA, but with several multinational marketing campaigns pushing behind it.
Yeah, like I. I. Rabi said in regard to people no longer being amazed by the achievements of physics, "What more do you want, mermaids?"
Anyone who remembers further back than a decade or so remembers when the height of AI research was chess programs that could beat grandmasters. Yes, LLMs aren't C3PO or the like, but they are certainly more like that than anything we could imagine just a few years ago.
The speed at which anything progresses is impressive if you're not paying attention while other people toil away on it for decades, until one day you finally look up and say, "Wow, the speed at which this thing progressed is insane!"
I remember seeing an AI lab in the late 1980's and thinking "that's never going to work" but here we are, 40 years later. It's finally working.
I'm glad I'm not the only person in awe with LLMs. It feels like it came straight out of science fiction novel. What does it take to impress people nowadays?
I feel like if teleportation was invented tomorrow, people would complain that it can't transport large objects so it's useless.
I often ask "So you say LLMs are worthless because you can't blindly trust the first thing they say? Do you blindly trust the first google search result? Do you trust every single thing your family members tell you?" It reminds me of my high school teachers saying Wikipedia can't be trusted.
Yeah the amount of "piffle work" that LLMs save me is astounding. Sure, I can look up fifty different numbers and copy them into excel. Or I can just tell an LLM "make a chart comparing measurements XYZ across devices ABC" and I'm looking at the info right there.
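The collation described above is simple but tedious by hand, which is exactly why delegating it is attractive. A sketch of the kind of table meant here; the device names and measurements are placeholders, not real data:

```python
# The kind of "piffle work" comparison table meant above,
# with placeholder device names and made-up measurements.
measurements = {
    "Device A": {"weight_g": 180, "battery_mAh": 4000, "screen_in": 6.1},
    "Device B": {"weight_g": 210, "battery_mAh": 5000, "screen_in": 6.7},
    "Device C": {"weight_g": 195, "battery_mAh": 4500, "screen_in": 6.4},
}
cols = ["weight_g", "battery_mAh", "screen_in"]
header = f"{'device':<10}" + "".join(f"{c:>14}" for c in cols)
print(header)
for name, m in measurements.items():
    print(f"{name:<10}" + "".join(f"{m[c]:>14}" for c in cols))
```

The numbers are also easy to spot-check against the sources, which is what keeps this in the "easily verified" category of LLM tasks.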
Probably because you don't have the same use case as them... doing "code" is an "easy" use case, but pondering a humanities subject is much harder... you cannot "learn the structure" of humanities, you have to know the facts... and LLMs are bad at that.
Because we're being told it is a perfect superintelligence, that it is going to replace senior engineers. The hype cycle is real, and worse than blockchain's ever was. I'm sure LLMs will be able to code a full enterprise app about the same time moon coin replaces $USD.
I wholeheartedly agree with you and it’s funny reading the replies to your comment.
Basically people just doubling down on everything you just described. I can't quite put a finger on it, but it has a tinge of insecurity or something like that; I hope that's not the case and I'm just misinterpreting.
It's like computer graphics and VR: Amazing advances over the years, very impressive, fun, cool, and by no means a temporary fad...
... But I do not believe we're on the cusp of a Lawnmower-Man future where someone's Metaverse eats all the retail-conference-halls and movie-theaters and retail-stores across the entire globe in an unbridled orgy of mind-shattering investor returns.
Similarly, LLMs are neat and have some sane uses, but the fervor about how we're about to invent the Omnimind and usher in the singularity and take over the (economic) world? Nah.
Today's models are far from autonomous thinking machines, and it is a cognitive bias among the masses to believe they are. It is just a giant calculator: it predicts "the most probable next word" from a sea of all combinations of next words.
I don't see it as a bigger leap than the internet itself. I recall needing books on my desk or a road trip to the local bookshop to find out coding answers. Stack Overflow beats AI most days, but the LLMs are another nice tool.
Exploring topics in a shallow fashion is fine with LLMs; doing anything deep is just too unreliable due to hallucination. All models I’ve talked to desperately want to give a positive answer, and thus will often just lie.
Indeed, it is the stuff of science fiction, and then you get an "akshually, it's just statistics" comment. I feel people are projecting their fears, because deep down, they're simply afraid.
I like LLMs for what they are. Classifiers. I don’t trust them as search engines because of hallucinations. I use them to get a bearing on a subject but then I’ll turn to Google to do the real research.
I go back and forth. I share your amazement. I used Gemini Deep Research the other day and was blown away. It claimed to go read 20 websites, and it showed its "thinking" and steps, its conclusions at each step. Then it wrote a large summary (several pages).
On the other hand, I saw github recently added Copilot as a code reviewer. For fun I let it review my latest pull request. I hated its suggestions but could imagine a not too distant future where I'm required by upper management to satisfy the LLM before I'm allowed to commit. Similarly, I've asked ChatGPT questions and it's been programmed to only give answers that Silicon Valley workers have declared "correct".
The thing I always find frustrating about the naysayers is that they seem to think how it works today is the end of it. Like, I recently listened to an episode of EconTalk interviewing someone on AI and education. She lives in the UK and used Tesla FSD as an example of how bad AI is. Yet I live in California and see Waymo mostly working today, with lots of people using it. I believe she wouldn't have used the Tesla FSD example, and would possibly have changed her worldview at least a little, if she'd updated on seeing self-driving work.
What you're impressed with is 40% the human skill in creating an LLM, 0.5% value created by the model, and 59.5% the skills of all the people it ate and whose livelihoods it is now trying to destroy.
As others have pointed out already, the hype about writing code like a senior engineer, or in general acting as a competent assistant, is what created the expectation in the first place. They keep over-promising but under-delivering. Who is the guy talking about AGI most of the time? Could it be the top executive of one of the largest gen AI companies, do you think? I won't deny it occasionally has a certain 'star-trek-computer' flair to it, but most of the time it feels like having a heavily degraded version of "Rain Man". He may count your cards perfectly one moment, then get stuck trying to untie his shoes. I stopped counting how many times it produced just outright wrong outputs, to the point of suggesting literally the opposite of what one is asking of it. I would not mind it so much if they were being advertised for what they are, not for what they could potentially be if only another half a trillion dollars were invested in data centers. It is not going to happen with this technology; the issue is structural, not resource-related.
Really? I just get garbage. Both Claude and CoPilot kept insisting that it was ok to use react hooks outside of function components. There have been many other situations where it gave me some code and even after refining the prompt it just gave me wrong or non working code. I’m not expecting perfection, but at least don’t waste my time with hallucinations or just flat out stuff that doesn’t work.
> And people are like, "Wah, it can't write code like a Senior engineer with 20 years of experience!"
Except this isn't true. The code quality varies dramatically depending on what you're doing, the length of the chat/context, etc. It's an incredible productivity booster, but even earlier today, I wasted time debugging hallucinated code because the LLM mixed up methods in a library.
The problem isn't so much that it's not an amazing technology, it's how it's being sold. The people who stand to benefit are speaking as though they've invented a god and are scaring the crap out of people making them think everyone will be techno-serfs in a few years. That's incredibly careless, especially when as a technical person, you understand how the underlying system works and know, definitively, that these things aren't "intelligent" the way they're being sold.
Like the startups of the 2010s, everyone is rushing, lying, and huffing hopium deluding themselves that we're minutes away from the singularity.
You forget the large group of people who proudly declare they'll invent AGI and that they can make everyone lose their jobs and starve. The complaints are for them, not for you.
Keep in mind it understands nothing. The notion that LLMs understand anything is fundamentally flawed, as they do not demonstrate any markers of understanding.
The fact that you don't know what "Markov chain" means and get angry at others over it pisses me off.
Both are Markov chains; that you used to erroneously think a Markov chain is a way to make a chatbot, rather than a general mathematical process, is on you, not them.
Not one of them has managed to generate a successful promise-based implementation of reCAPTCHA v2 in JavaScript from scratch (https://developers.google.com/recaptcha/docs/loading), and they have a million-plus references for this.
Because the marketers oversold it. That is why you are seeing a pushback. I also outright rejected them because 1) they were sold and marketed as end all be all replacements for human thought, and 2) they promised to replace only the parts of my job that I enjoy. Billboards were up in San Francisco telling my "bosses" that I was soon to be replaced, and the loudest and earliest voices told me that the craft I love is dead. Imagine Nascar drivers excitedly discussing how cool it was they wouldn't have to turn left anymore - made me wonder why everyone else was here.
It was, more or less, the same narrative arc as Bitcoin, and was (is) headed for a crash.
That said, I've spent a few weeks with augment, and it is revelatory, certainly. All the marketing - aimed at a suite I have no interest in - managed to convince me it was something it wasn't. It isn't a replacement, any more than a power drill is a replacement for a carpenter.
What it is, is very helpful. "The world's most fully functioning scaffolding script", an upgrade from copilot's "the world's most fully functioning tab-completer". I appreciate its usefulness as a force multiplier, but I am already finding corners and places where I'd just prefer to do it myself. And this is before we get into the craft of it all - I am not excited by the pitch "worse code, faster", but the utility is undeniable on this capitalistic hell planet, and I'm not a huge fan of writing SQL queries anyway, so here we are!
For me, LLMs are a bit like being shown a talking dog with the education and knowledge of a first grader: a talking dog is amazing in itself, and a truly impressive technical feat. That said, you wouldn't make the dog file your taxes or represent you in court.
To quote Joel Spolsky, "When you’re working on a really, really good team with great programmers, everybody else’s code, frankly, is bug-infested garbage, and nobody else knows how to ship on time.", and that's the state we end up if we believe in the hype and use LLMs willy-nilly.
That's why people are annoyed, not because LLMs cannot code like a senior engineer, but because lots of content marketing a company valuation is dependent on making people believe it's the case.
I mean. How would you feel if you coded a menu in Python with certain choices but when you used it the choices were never the same or in the same order, sometimes there were fake choices, sometimes they are improperly labelled and sometimes the menu just completely fails to open. And you as a coder and you as a user have absolutely no control over any of those issues. Then, when you go online to complain people say useful stuff like "Isn't it amazing that it does anything at all!? Give us a break, we're working on it bro."
That's how I see LLMs and the hype surrounding them.
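The menu analogy can even be written down. Here's a toy sketch (purely illustrative - the choices and the "Frobnicate" entry are made up) of a Python menu that misbehaves exactly the way described: different choices each run, occasional fake or mislabelled entries, occasional outright failure.

```python
import random

CHOICES = ["Open", "Save", "Export", "Quit"]

def llm_style_menu():
    """The menu from the analogy: same code, different menu every call."""
    # Choices are never the same or in the same order.
    opts = random.sample(CHOICES, k=random.randint(0, len(CHOICES)))
    if random.random() < 0.3:
        opts.append("Frobnicate")      # sometimes there are fake choices
    if random.random() < 0.2:
        opts = [o[::-1] for o in opts] # sometimes improperly labelled
    if random.random() < 0.1:
        raise RuntimeError("menu failed to open")
    return opts
```

A deterministic menu is a few lines of boring code; the maddening part of the analogy is that no amount of fixing your own code makes this one behave.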
a lot of it is just plain denial. a certain subgenre of person will forever attack everything AI does because they feel threatened by it and a certain other subgenre of person will just copy this behaviour and parrot opinions for upvotes/likes/retweets.
I'll keep bringing up this example whenever people dismiss LLMs.
I can ask Claude the most inane programming question and got an answer. If I were to do that on StackOverflow, I'd get downvoted, rude comments, and my question closed for being off-topic. I don't have to be super knowledgeable about the thing I'm asking about with Claude (or any LLM for that matter).
Even if you ignore the rudeness and elitism of power-users of certain platforms, there's no more waiting for someone to respond to your esoteric questions. Even if the LLM spews bullshit, you can ask it clarifying questions or rephrase until you see something that makes sense.
I love LLMs, I don't care what people say. Even when I'm just spitballing ideas[1], the output is great.
For me, I think they're valuable but also overhyped. They're not at the point they're replacing entire dev teams like some articles point out. In addition, they are amazingly accurate sometimes and amazingly misleading other times. I've noticed some ardent advocates ignore the latter.
It's incredibly frustrating when people think they're a miracle tool and blindly copy/paste output without doing any kind of verification. This is especially frustrating when someone who's supposed to be a professional in the field is doing it (copy-pasting non-working AI-generated code and putting it up for review).
On one hand, they multiply productivity and useful information. On the other hand, they kill productivity and spread misinformation. So I still see them as useful, but not a miracle.
I blame overpromised expectations from startups and public companies, screaming about AGI and superintelligence.
Truly amazing technology which is very good at generating and correcting texts is marketed as senior developer, talented artist, and black box that has solution to all your problems. This impression shatters on the first blatant mistake, e.g. counting elephant legs: https://news.ycombinator.com/item?id=38766512
It's the classic HN-like anti-anything bubble we see with Javascript frameworks. Hundreds of thousands of people are productive with them and enjoy them. They created entire industries and job fields. The same is happening with LLMs, but the usual counter-culture dev crowd is denying it while it's happening right before their eyes. I too use LLMs every day. I never click a link it gives me and find it doesn't exist. When I want to take my mind off of things, I just talk with GPT.
You're being disingenuous. The tweet was talking about asserting the existence of fake articles, claiming that a paper was written in one year while summarizing a paper that explicitly says it was written in another, and severe hallucinations. Nowhere does she even imply that she's looking for superintelligence.
What I find interesting is that my experience has been 100% the opposite. I've been using ChatGPT, Claude, and Gemini for almost a year (well, only ChatGPT for a year, since the rest are more recent). I've been using them to help build circuits and write code. They are almost always wrong with circuit design, and create code that doesn't work north of 80% of the time. My patience has dropped off to the point where I only experiment with LLMs a few times a week because they are so bad. Yes, it is miraculous that we can have a conversation, but it means nothing if the output is always wrong.
But I will admit the dora muckbang feet shit is fucking insane. And that just flat out scares the pants off me.
>They are almost always wrong with circuit design, and create code that doesn’t work north of 80% of the time.
Sorry but this is a total skill issue lol. 80% code failure rate is just total nonsense. I don't think 1% of the code I've gotten from LLMs has failed to execute correctly.
LLMs can't be trusted. They are like an overconfident idiot who pretends quite impressively, but if you check the result, there's just a bit too much bullshit in it. So there's practically zero gain in using LLMs except when you actually need text that's nice, eloquent bullshit.
Almost every time I've tried using LLMs, I've fallen into the pattern of calling out, correcting, and arguing with them, which is of course silly in itself, because they don't learn; they don't really "get it" when they are wrong. Unlike arguing with a human, there's no benefit to it.
This is the place where tech shiny meets actual use cases, and users aren’t really good at articulating their problems.
It's also a slow-burn issue: you have to use it for a while for what is obvious to users to become obvious to people who are tech-first.
The primary issue is the hype and forecasted capabilities vs actual use cases. People want something they can trust as much as an authority, not as much as a consultant.
If I were to put it in a single sentence? These are primarily narrative tools, being sold as factual/scientific tools.
When this is pointed out, the conversation often shifts to “well people aren’t that great either”. This takes us back to how these tools are positioned and sold. They are being touted as replacements to people in the future. When this claim is pressed, we get to the start of this conversation.
Frankly, people on HN aren’t pessimistic enough about what is coming down the pipe. I’ve started looking at how to work in 0 Truth scenarios, not even 0 trust. This is a view held by everyone I have spoken to in fraud, misinformation, online safety.
There’s a recent paper which showed that GAI tools improved the profitability of Phishing attempts by something like 50x in some categories, and made previously loss making (in $/hour terms) targets, profitable. Schneier was one of the authors.
A few days ago I found out someone I know who works in finance, had been deepfaked and their voice/image used to hawk stock tips. People were coming to their office to sue them.
I love tech, but this is the dystopia part of cyberpunk being built. These are narrative tools, good enough to make people think they are experts.
The thing LLMs are really really good at, is sounding authoritative.
If you ask it random things the output looks amazing, yes. At least at first glance. That's what they do. It's indeed magical, a true marvel that should make you go: Woooow, this is amazing tech: Coming across as convincing, even if based on hallucinations, is in itself a neat trick!
But is it actually useful? The things they come up with are untrustworthy and on the whole far less good than previously available systems. In many ways, insidiously worse: It's much harder to identify bad information than it was before.
It's almost like we designed a system to pass Turing tests with flying colours while forgetting that usefulness is what we actually wanted, not authoritative, human-sounding bullshit.
I don't think the LLM naysayers are 'unimpressed', or that they demand perfection. I think they are trying to make statements aimed at balancing things:
Both the LLMs themselves, and the humans parroting the hype, are severely overstating the quality of what such systems produce. Hence, and this is a natural phenomenon you can observe in all walks of life, the more skeptical folks tend to swing the pendulum the other way, and thus it may come across to you as them being overly skeptical instead.
I totally agree, and this community is far from the worst. In trans communities there's incredible hostility towards LLMs - even local ones. "You're ripping off artists", "A pissing contest for tech bros", etc.
I'm trans, and I don't disagree that this technology has aspects that are problematic. But for me at least, LLMs have been a massive equalizer in the context of a highly contentious divorce where the reality is that my lawyer will not move a finger to defend me. And he's lawyer #5 - the others were some combination of worse, less empathetic, and more expensive. I have to follow up a query several times to get a minimally helpful answer - it feels like constant friction.
ChatGPT was a total game-changer for me. I told it my ex was using our children to create pressure, feeding it snippets of chat transcripts. ChatGPT suggested this might be indicative of coercive control abuse. It sounded very relevant (my ex even admitted one time, in a rare, candid moment, that she feels a need to control everyone around her), so I googled the term - essentially all the components were there except physical violence (with two notable exceptions).
Once I figured that out, I asked it to tell me about laws related to controlling relationships - and it suggested laws directly addressing it (in the UK and Australia) and the closest laws in Germany (Nötigung, Nachstellung, violations of dignity, etc.), translating them to English - my best language. Once you name specific laws broken and provide a rationale for why there's a Tatbestand (i.e. the criterion for a violation is fulfilled), your lawyer has no option but to take you more seriously. Otherwise he could face a malpractice suit.
Sadly, even after naming specific law violations and pointing to email and chat evidence, my lawyer persists in dragging his feet - so much so that the last legal letter he sent wasn't drafted by him - it was ChatGPT. I told my lawyer: read, correct, and send to X. All he did was to delete a paragraph and alter one or two words. And the letter worked.
Without ChatGPT, I would be even more helpless and screwed than I am. It's far from clear I will get justice in a German court, but at least ChatGPT gives me hope, a legal strategy. Lastly - and this is a godsend for a victim of coercive control - it doesn't degrade you. Lawyers do. It completely changed the dynamics of my divorce (4 years - still no end in sight, lost my custody rights, then visitation rights, was subjected to confrontational and gaslighting tactics by around a dozen social workers - my ex is a social worker -, and then I literally lost my hair: telogen effluvium, tinea capitis, alopecia areata... if it's stress-related, I've had it), it gave me confidence when confronting my father and brother about their family violence.
It's been the ONLY reliable help, frankly, so much so I'm crying as I write this. For minorities that face discrimination, ChatGPT is literally a lifeline - and that's more true the more vulnerable you are.
I agree. I recently asked if a certain GPU would fit in a certain computer... And it understood that "fit" could mean physically inside but could also mean that the interface is compatible, and answered both.
TBH, they produce trash results for almost any question I might want to ask them. This is consistently the case. I must use them differently than other people.
LLMs produce midwit answers. If you are an expert in your domain, the results are kind of what you would expect for someone who isn’t an expert. That is occasionally useful but if I wanted a mediocre solution in software I’d use the average library. No LLM I have ever used has delivered an expert answer in software. And that is where all the value is.
I worked in AI for a long time, I like the idea. But LLMs are seemingly incapable of replacing anything of value currently.
The elephant in the room is that there is no training data for the valuable skills. If you have to rely on training data to be useful, LLMs will be of limited use.
Here’s when we can start getting excited about LLMs: when they start making new and valid scientific discoveries that can radically change our world.
When an AI can say “Here’s how you make better, smaller, more powerful batteries, follow these plans”, then we will have a reason to worship AI.
When AI can bring us wonders like room-temperature superconductors, fast interstellar travel, anti-gravity tech, and solutions to world hunger and energy consumption, then it will have fulfilled the promise of what AI could do for humanity.
Until then, LLMs are just fancy search and natural language processors. Puppets with strings. It’s about as impressive as Google was when it first came out.
My experience (almost exclusively Claude), has just been so different that I don't know what to say. Some of the examples are the kinds of things I explicitly wouldn't expect LLMs to be particularly good at so I wouldn't use them for, and others, she says that it just doesn't work for her, and that experience is just so different than mine that I don't know how to respond.
I think that there are two kinds of people who use AI: people who are looking for the ways in which AIs fail (of which there are still many) and people who are looking for the ways in which AIs succeed (of which there are also many).
A lot of what I do is relatively simple one off scripting. Code that doesn't need to deal with edge cases, won't be widely deployed, and whose outputs are very quickly and easily verifiable.
LLMs are almost perfect for this. It's generally faster than me looking up syntax/documentation, when it's wrong it's easy to tell and correct.
Look for the ways that AI works, and it can be a powerful tool. Try and figure out where it still fails, and you will see nothing but hype and hot air.
Not every use case is like this, but there are many.
-edit- Also, when she says "none of my students has ever invented references that just don't exist"...all I can say is "press X to doubt"
> Look for the ways that AI works, and it can be a powerful tool. Try and figure out where it still fails, and you will see nothing but hype and hot air. Not every use case is like this, but there are many.
The problem is that I feel I am constantly being bombarded by people bullish on AI saying "look how great this is" but when I try to do the exact same things they are doing, it doesn't work very well for me
Of course I am skeptical of positive claims as a result.
I don't know what you are doing or why it's failed. Maybe my primary use cases really are in the top whatever percentile for AI usefulness, but it doesn't feel like it. All I know is that frontier models have already been good enough for more than a year to increase my productivity by a fair bit.
I literally had a developer of an open source package I’m working with tell me “yeah that’s a known problem, I gave up on trying to fix it. You should just ask ChatGPT to fix it, I bet it will immediately know the answer.”
Annoying response of course. But I’d never used an LLM to debug before, so I figured I’d give it a try.
First: it regurgitated a bunch of documentation and basic debugging tips, which might have actually been helpful if I had just encountered this problem and had put no thought into debugging it yet. In reality, I had already spent hours on the problem. So not helpful
Second: I provided some further info on environment variables I thought might be the problem. It latched on to that. “Yes that’s your problem! These environment variables are (causing the problem) because (reasons that don’t make sense). Delete them and that should fix things.” I deleted them. It changed nothing.
Third: It hallucinated a magic numpy function that would solve my problem. I informed it this function did not exist, and it wrote me a flowery apology.
Clearly AI coding works great for some people, but this was purely an infuriating distraction. Not only did it not solve my problem, it wasted my time and energy, and threw tons of useless and irrelevant information at me. Bad experience.
I see people say, "Look how great this is," and show me an example, and the example they show me is just not great. We're literally looking at the same thing, and they're excited that this LLM can do a college grad's job to the level of a third grader, and I'm just not excited about that.
What changed my point of view regarding LLMs was when I realized how crucial context is in increasing output quality.
Treat the AI as a freelancer working on your project. How would you ask a freelancer to create a Kanban system for you? By simply asking "Create a Kanban system", or by providing them a 2-3 pages document describing features, guidelines, restrictions, requirements, dependencies, design ethos, etc?
Which approach will get you closer to your objective?
The same applies to LLM (when it comes to code generation). When well instructed, it can quickly generate a lot of working code, and apply the necessary fixes/changes you request inside that same context window.
It still can't generate senior-level code, but it saves hours when doing grunt work or prototyping ideas.
"Oh, but the code isn't perfect".
Nor is the code of the average junior dev, but their code still makes it to production in thousands of companies around the world.
They're sophisticated tools, as much as any other software.
About 2 weeks ago I started on a streaming markdown parser for the terminal, because none really existed. I've switched to hand-coding now, but the first version was basically all LLM prompting, and a bunch of the code is still LLM-generated (maybe 80%). It's a parser; those are hard. There are stacks, states, lookaheads, lookbehinds, feature flags, color spaces, support for things like links and syntax highlighting... all forward-streaming. Not easy.
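For a sense of what "forward streaming" means here, a toy sketch (my own illustration, handling only `**bold**` and nothing like the real feature set) of the buffering/lookahead problem such a parser deals with:

```python
def stream_bold(chunks):
    """Stream markdown text chunk by chunk, rendering **bold** as ANSI.
    A lone '*' can't be classified until the next character arrives,
    so it is buffered in `pending` (the one-character lookahead)."""
    BOLD_ON, BOLD_OFF = "\x1b[1m", "\x1b[22m"
    out = []
    in_bold = False
    pending = ""  # a '*' we can't classify yet
    for chunk in chunks:
        for ch in chunk:
            if pending == "*":
                if ch == "*":  # '**' toggles bold state
                    out.append(BOLD_OFF if in_bold else BOLD_ON)
                    in_bold = not in_bold
                else:          # lone '*' was literal text after all
                    out.append("*" + ch)
                pending = ""
            elif ch == "*":
                pending = "*"  # wait for the next char to decide
            else:
                out.append(ch)
    out.append(pending)  # flush any trailing unclassified '*'
    return "".join(out)
```

Even this trivial case needs a pending-state buffer because the stream can split mid-token; multiply that by links, code spans, and color handling and the difficulty is clear.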
> LLMs are almost perfect for this. It's generally faster than me looking up syntax/documentation, when it's wrong it's easy to tell and correct.
Exactly this.
I once had a function that would generate several .csv reports. I wanted these reports to then be uploaded to s3://my_bucket/reports/{timestamp}/*.csv
I asked ChatGPT: "Write a function that moves all .csv files in the current directory to an old_reports directory, calls a create_reports function, then uploads all the csv files in the current directory to s3://my_bucket/reports/{timestamp}/*.csv with the timestamp in YYYY-MM-DD format"
And it created the code perfectly. I knew what the correct code would look like, I just couldn't be fucked to look up the exact calls to boto3, whether moving files was os.move or os.rename or something from shutil, and the exact way to format a datetime object.
It created the code far faster than I would have.
Like, I certainly wouldn't use it to write a whole app, or even a whole class, but individual blocks like this, it's great.
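The function described above might look something like the following sketch. The names are hypothetical, and `upload` is a stand-in callable for the real boto3 call (e.g. `s3.upload_file(path, "my_bucket", key)`), passed in so the sketch works without AWS credentials:

```python
import os
import shutil
from datetime import datetime, timezone

def archive_and_upload(create_reports, upload, directory="."):
    """Move existing .csv reports aside, regenerate them, then hand each
    new .csv to `upload` with a YYYY-MM-DD-stamped key under reports/."""
    old_dir = os.path.join(directory, "old_reports")
    os.makedirs(old_dir, exist_ok=True)
    # 1. Move yesterday's reports out of the way (it's shutil.move, not
    #    os.move, which doesn't exist -- exactly the detail not worth memorizing).
    for name in os.listdir(directory):
        if name.endswith(".csv"):
            shutil.move(os.path.join(directory, name),
                        os.path.join(old_dir, name))
    # 2. Regenerate the reports.
    create_reports()
    # 3. Upload each fresh .csv under a timestamped prefix.
    stamp = datetime.now(timezone.utc).strftime("%Y-%m-%d")
    keys = []
    for name in sorted(os.listdir(directory)):
        if name.endswith(".csv"):
            key = f"reports/{stamp}/{name}"
            upload(os.path.join(directory, name), key)
            keys.append(key)
    return keys
```

Exactly the kind of glue code where the structure is obvious and only the library incantations need looking up.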
I have been saying this about llms for a while - if you know what you want, how to ask for it, and what the correct output will look like, LLMs are fantastic (at least Claude Sonnet is). And I mean that seriously, they are a highly effective tool for productive development for senior developers.
I use it to produce whole classes, large sql queries, terraform scripts, etc etc. I then look over that output, iterate on it, adjust it to my needs. It's never exactly right at first, but that's fine - neither is code I write from scratch. It's still a massive time saver.
I've had so many cases exactly like your example here. If you build up an intuition that knows that e.g. Claude 3.7 Sonnet can write code that uses boto3, and boto3 hasn't had any breaking changes that would affect S3 usage in the past ~24 months, you can jump straight into a prompt for this kind of task.
It doesn't just save me a ton of time, it results in me building automations that I normally wouldn't have taken on at all because the time spent fiddling with os.move/boto3/etc wouldn't have been worthwhile compared to other things on my plate.
I think you have an interesting point of view and I enjoy reading your comments, but it sounds a little absurd and circular to discount people's negativity about LLMs simply because it's their fault for using an LLM for something it's not good at. I don't believe in the strawman characterization of people giving LLMs incredibly complex problems and being unreasonably judgemental about the unsatisfactory results. I work with LLMs every day. Companies pay me good money to implement reliable solutions that use these models and it's a struggle. Currently I'm working with Claude 3.5 to analyze customer support chats. Just as many times as it makes impressive, nuanced judgments it fails to correctly make simple trivial judgements. Just as many times as it follows my prompt to a tee, it also forgets or ignores important parts of my prompt. So the problem for me is it's incredibly difficult to know when it'll succeed and when it'll fail for a given input. Am I unreasonable for having these frustrations? Am I unreasonable for doubting the efficacy of LLMs to address problems that many believe are already solved? Can you understand my frustration to see people characterize me as such because ChatGPT made a really cool image for them once?
It's a weird circle with these things. If you _can't_ do the task you are using the LLM for, you probably shouldn't.
But if you can do the task well enough to at least recognize likely-to-be-correct output, then you can get a lot done in less time than you would do it without their assistance.
Is that worth the second order effects we're seeing? I'm not convinced, but it's definitely changed the way we do work.
I think this points to much of the disagreement over LLMs. They can be great at one-off scripts and other similar tasks like prototypes. Some folks who do a lot of that kind of work find the tools genuinely amazing. Other software engineers do almost none of that and instead spend their coding time immersed in large messy code bases, with convoluted business logic. Looping an LLM into that kind of work can easily be net negative.
Maybe they are just lazy around tooling. Cursor with Claude works well for project sizes much larger than I expected but it takes a little set up. There is a chasm between engineers who use tools well and who do not.
I'm tired of people bashing LLMs. AI is so useful in my daily work that I can't understand where these people are coming from. Well, whatever...
As you said, examples where I wouldn't expect LLMs to be good at from people who dismiss the scenarios where LLMs are great at. I don't want to convince anyone, to be honest - I just want to say they are incredibly useful for me and a huge time saver. If people don't want to use LLMs, it's fine for me as I'll have an edge over them in the market. Thanks for the cash, I guess.
Every time someone brings up "code that doesn't need to deal with edge cases", I like to point out that such code is not likely to be used for anything that matters.
Oh, but it is. I can have code that does something nice-to-have, that needs not be 100% correct, etc. For example, I want a background for my playful webpage. Maybe a WebGL shader. It might not be exactly what I asked for, but I can have it up and running in a few minutes. Or some non-critical internal tools - like a scraper for lunch menus from the restaurants around the office. Or a simple parking-spot-sharing app. Or any kind of prototype, which in some companies are being created all the time. There are so many use cases that are forgiving regarding correctness and much more sensitive to development effort.
I’m always amazed in these discussions how many people apparently have jobs doing a bunch of stuff that either doesn’t need to be correct or is simple enough that it doesn’t require any significant amount of external context.
Automating the easy 80% sounds useful, but in practice I'm not convinced that's all that helpful. Reading and putting together code you didn't write is hard enough to begin with.
I write code like that all the time. It's used for very specific use cases, only by myself or something I've also written. It's not exposed to random end users or inputs.
> Also, when she says "none of my students has ever invented references that just don't exist"...all I can say is "press X to doubt"
I’ve never seen it from my students. Why do you think this? It’s trivial to pick a real book/article. No student is generating fake material whole cloth and fake references to match. Even if they could, why would they risk it?
Look for the ways that AI works, and it can be a powerful tool. Try and figure out where it still fails, and you will see nothing but hype and hot air.
Perfectly put, IMO.
I know arguments from authority aren't primary, but I think this point highlights some important context: Dr. Hossenfelder has gained international renown by publishing clickbait-y YouTube videos that ostensibly debunk scientific and technological advances of all kinds. She's clearly educated and thoughtful (not to mention otherwise gainfully employed), but her whole public persona kinda relies on assuming the exclusively-critical standpoint you mention.
I doubt she necessarily feels indebted to her large audience expecting this take (it's not new...), but that certainly does seem like a hard cognitive habit to break.
More often than not, when I inquire deeper, I find their prompting isn't very good at all.
"Garbage in, garbage out" as the law says.
Of course, it took a lot of trial and error for me to get to my current level of effectiveness with LLMs. It's probably our responsibility to teach these who are willing.
It seems hard to be bullish on LLMs as a generally useful tool if the solution to problems people have is "use trial and error to improve how you write your prompts, no, it's not obvious how to do so, yes, it depends heavily on the exact model you use."
Agreed. If one compares ChatGPT to, say, the Cline IDE plugin backed by Claude 3.7, they might well be blown away by how far behind ChatGPT seems. A lot of the difference has to do with prompting, for sure -- Cline helps there by generating prompts from your IDE and project context automatically.
Every once in a while I send a query off to ChatGPT and I'm often disappointed and jam on the "this was hallucinated" feedback button (or whatever it is called). I have better luck with Claude's chat interface but nowhere near the quality of response that I get with Cline driving.
I want to sit next to you and stop you every time you use your LLM and say, “Let me just carefully check this output.” I bet you wouldn’t like that. But when I want to do high quality work, I MUST take that time and carefully review and test.
What I am seeing is fanboys who offer me examples of things working well that fail any close scrutiny— with the occasional example that comes out actually working well.
I agree that for prototyping unimportant code LLMs do work well. I definitely get to unimportant point B from point A much more quickly when trying to write something unfamiliar.
What's also scary is that we know LLMs fail, but nobody (even the people who built the LLM) can tell you how often one will fail at any particular task. Not even to an order of magnitude. Will it fail 0.2%, 2%, or 20% of the time? Nobody knows! A computer that randomly produces an incorrect result for my calculation is useless to me, because now I have to separately validate the correctness of every result. If I ask an LLM to explain some fact to me, how do I know whether this time it's hallucinating? There is no "LLM just guessed" flag in the output. It might seem "miraculous" that it will summarize a random scientific paper down to 5 bullet points, but how do you know its output is correct? No LLM proponent seems to want to answer this question.
I think it’s very odd that you think that people using LLMs regularly aren’t carefully checking the outputs. Why do you think that people using LLMs don’t care about their work?
> invented references that just don't exist"...all I can say is "press X to doubt
This doesn’t include lying and cheating which LLMs can’t.
On the other hand, AI is used to solve problems that are already solved. I recently got an ad for process-modeling software where one claim was that you don't always need to start from the ground up: you can tell the AI "give me the customer order process" and start from that point. That is basically what templates are for, with much less energy consumption.
I've noticed there seems to be a gatekeeping archetype that operates as a hard cynic to nearly everything, so that when they finally judge something positively they get heaps of attention.
It doesn't always correlate with narcissism, but it happens much more than chance.
>A lot of what I do is relatively simple one off scripting. Code that doesn't need to deal with edge cases, won't be widely deployed, and whose outputs are very quickly and easily verifiable.
Yes, somewhat. It's good for PowerShell/bash/cmd scripts and configs, but early models especially would hallucinate PowerShell cmdlets.
One thing I think is clear is society is now using a lot of words to describe things when the words being used are completely devoid of the necessary context. It's like calling a powder you've added to water "juice" and also freshly-squeezed fruit just picked perfectly ripe off a tree "juice". A word stretched like that becomes nearly devoid of meaning.
"I write code all day with LLMs, it's amazing!" is in the exact same category. The code you (general you, I'm not picking on you in particular) write using LLMs, and the code I write apart from LLMs: they are not the same. They are categorically different artifacts.
All fun and games until your AI-generated script deletes the production database. I think that's the point: the cost of faults in academic and financial settings is too high for LLMs to be useful.
The point is that given the current valuations, being good at a bunch of narrow use cases is just not good enough. It needs to be able to replace humans in every role where the primary output is text or speech to meet expectations.
I don't think that "replacing humans in every role" is the line for "being bullish on AI models". I think they could stop development exactly where they are, and they would still make pretty dramatic improvements to productivity in a lot of places. For me at least, their value already exceeds the $20/month I'm paying, and I'm pretty sure that way more than covers inference costs.
The most interesting thing about this post is how it reinforces how terrible the usability of LLMs still is today:
"I ask them to give me a source for an alleged quote, I click on the link, it returns a 404 error. I Google for the alleged quote, it doesn't exist. They reference a scientific publication, I look it up, it doesn't exist."
To experienced LLM users that's not surprising at all: providing citations, sources for quotes, and useful URLs are all things they are demonstrably terrible at.
But it's a computer! Telling people "this advanced computer system cannot reliably look up facts" goes against everything computers have been good at for the last 40+ years.
One of the things that’s hard about these discussions is that behind them is an obscene amount of money and hype. She’s not responding to realists like you. She’s responding to the bulls. The people saying these tools will be able to run the world by the end of this year, maybe next.
And that’s honestly unfair to you, since you do awesome, realistic, and level-headed work with LLMs.
But I think it’s important when having discussions to understand the context within which they are occurring.
Without the bulls she might very well be saying what you are in your last paragraph. But because of the bulls the conversation becomes this insane stratified nonsense.
Possibly a reaction to Bill Gates's recent statements that AI will begin replacing doctors and teachers. It's not ridiculous to say LLMs are incredibly useful and valuable, but it's highly dubious to think they can be trusted with actual critical tasks without careful supervision.
This isn't really a problem in tool-assisted LLMs.
Use Google AI Studio with search grounding; it provides correct links and citations every time. Other companies have similar search modes, but you have to enable those settings if you want good results.
You're using them wrong. Everyone is, though, so I can't fault you specifically. A chatbot is about the worst possible application of these technologies.
Of late, deaf tech forums have been taken over by language-model debates over which works best for speech transcription. (Multimodal language models are the state of the art in machine transcription. Everyone seems to forget that when complaining they can't cite sources for scientific papers yet.) The debates have gotten to the point where it's become annoying how they have taken over so much space, just like they have here on HN.
But then I remember, oh yeah, there was no such thing as live machine transcription ten years ago. And now there is. And it's going to continue to get better. It's already good enough to be very useful in many situations. I have elsewhere complained about the faults of AI models for machine transcription - in particular when they make mistakes they tend to hallucinate something that is superficially grammatical and coherent instead - but for a single phrase in an audio transcription sporadically that's sometimes tolerable. In many cases you still want a human transcriber but the cost of that means that the amount of transcription needed can never be satisfied.
It's a revolutionary technology. I think in a few years I'm going to have glasses that continuously narrate the sounds around me and transcribe speech, and it's going to be so good I can probably "pass" as a hearing person in some contexts. It's hard not to get a bit giddy and carried away sometimes.
> You're using them wrong. Everyone is though I can't fault you specifically.
If everyone is using them wrong, I would argue that says something more about them than the users. Chat-based interfaces are the thing that kicked LLMs into the mainstream consciousness and started the cycle/trajectory we’re on now. If this is the wrong use case, everything the author said is still true.
There are still applications made better by LLMs, but they are a far cry from AGI/ASI in terms of being all-knowing problem solvers that don’t make mistakes. Language tasks like transcription and translation are valuable, but by no stretch do they account for the billions of dollars of spend on these platforms, I would argue.
LLM providers actually have an incentive not to write literature on how to use LLMs optimally, as that causes friction, which means less engagement/money spent with the provider. There's also the typical tin-foil-hat explanation: "it's bad so you'll keep retrying to get the LLM to work, which means more money for us."
Isn't this more a product of the hype though? At worst you're describing a product marketing mistake, not some fundamental shortcoming of the tech. As you say "chat" isn't a use case, it's a language-based interface. The use case is language prediction, not an encyclopedic storage and recall of facts and specific quotes. If you are trying to get specific facts out of an LLM, you'd better be using it as an interface that accesses some other persistent knowledge store, which has been incorporated into all the major 'chat' products by now.
Surely you're not saying everyone is using them wrong. Let's say only 99% of them are using LLMs wrong, and the remaining 1% creates $100B of economic value. That's $100B of upside.
Yes the costs of training AI models these days are really high too, but now we're just making a quantitative argument, not a qualitative one.
The fact that we've discovered a near-magical tech that everyone wants to experiment with in various contexts, is evidence that the tech is probably going somewhere.
Historically speaking, I don't think any scientific invention or technology has been adopted and experimented with so quickly and on such a massive scale as LLMs.
It's crazy that people like you dismiss the tech simply because people want to experiment with it. It's like some of you are against scientific experimentation for some reason.
I think all the technology is already in place. There are already smart glasses with tiny text displays. Also smartphones have more than enough processing capacity to handle live speech transcription.
Thru the 90s and 00s and well into the 10s I generally dismissed speech recognition as useless to me, personally.
I have a minor speech impediment because of the hearing loss. They never worked for me very well. I don't speak like a standard American - I have a regional accent and I have a speech impediment. Modern speech recognition doesn't seem to have a problem with that anymore.
IBM's ViaVoice from 1997 in particular was a major step. It was really impressive in a lot of ways but the accuracy rate was like 90 - 95% which in practice means editing major errors with almost every sentence. And that was for people who could speak clearly. It never worked for me very well.
You also needed to speak in an unnatural way [pause] comma [pause] and it would not be fair to say that it transcribed truly natural speech [pause] full stop
Such voice recognition systems before about 2016 also required training on the specific speaker. You would read many pages of text to the recognition engine to tune it to you specifically.
It could not just be pointed at the soundtrack to an old 1980s TV show then produce a time-sync'd set of captions accurate enough to enjoy the show. But that can be done now.
I become more and more convinced with each of these tweets/blogs/threads that using LLMs well is a skill set akin to using Search well.
It’s been a common mantra - at least in my bubble of technologists - that a good majority of the software engineering skill set is knowing how to search well. Knowing when search is the right tool, how to format a query, how to peruse the results and find the useful ones, what results indicate a bad query you should adjust… these all sort of become second nature the longer you’ve been using Search, but I also have noticed them as an obvious difference between people that are tech-adept vs not.
LLMs seem to have a very similar usability pattern. They’re not always the right tool, and are crippled by bad prompting. Even with good prompting, you need to know how to notice good results vs bad, how to cherry-pick and refine the useful bits, and have a sense for when to start over with a fresh prompt. And none of this is really _hard_ - just like Search, none of us need to go take a course on prompting - IMO folks just need to engage with LLMs as a non-perfect tool they are learning how to wield.
The fact that we have to learn a tool doesn’t make it a bad one. The fact that a tool doesn’t always get it 100% on the first try doesn’t make it useless. I strip a lot of screws with my screwdriver, but I don’t blame the screwdriver.
I don't know if she is a fraud, but she has definitely leaned hard into rage-bait farming, talking about things far outside her domain of expertise as if she were an expert.
In no way am I credentialing her, lots of people can make astute observations about things they weren't trained in, but she both mastered sounding authoritative and at the same time, presenting things to go the most engagement possible.
Thanks for sharing this. I was heavily involved in graduate physics when I was in school, and was very worried about what direction she'd take after the first big viral vid "telling her story." I wasn't sure it was well understood, or even understood at all, how blinkered her...viewpoint?...was.
LLMs function as a new kind of search engine, one that is amazingly useful because it can surface things that traditional search could never dream of. Don't know the name of a concept, just describe it vaguely and the LLM will pull out the term. Are you not sure what kind of information even goes into a cover letter or what's customary to talk about? Ask an LLM to write you one, it will be bland and generic sure but that's not the point because you now know the "shape" of what they're supposed to look like and that's great for getting unblocked. Have you stumbled across a passage of text that's almost English but you're not really sure what to look up to decipher it? Paste it into the LLM and it will tell you that it's "Early Modern English" which you can look up to confirm and get a dictionary for.
Broader than that, it’s critical thinking skills. Using search and LLMs requires analyzing the results and being able to separate what is accurate and useful from what isn’t.
From my experience this is less an application of critical skills and more a domain knowledge check. If you know enough about the subject to have accumulated heuristics for correctness and intuition for "lgtm" in the specific context, then it's not very difficult or intellectually demanding to apply them.
If you don't have that experience in this domain, you will spend approximately as much effort validating output as you would have creating it yourself, but the process is less demanding of your critical skills.
I'm not so sure about that. I was really anti-LLM in the previous generation (GPT-3.5/4), but never stopped trying them out. I just found the results to be subpar.
Since reasoning models came about, I've been significantly more bullish on them, purely because they are less bad. They are still not amazing, but they are at a point where I feel like including them in my workflow isn't an impediment.
They can now reliably complete a subset of tasks without me needing to rewrite large chunks of the output myself.
They are still pretty terrible at edge cases (uncommon patterns/libraries, etc.), but when on the beaten path they can actually pretty decently improve productivity. I still don't think it's 10x (well, today was the first time I felt a 10x improvement, but I was moving frontend code from a custom framework to React; more tedium than anything else, and the AI did a spectacular job).
If there's one common thread across LLM criticisms, it's that they're not perfect.
These critics don't seem to have learned the lesson that the perfect is the enemy of the good.
I use ChatGPT all the time for academic research. Does it fabricate references? Absolutely, maybe about a third of the time. But has it pointed me to important research papers I might never have found otherwise? Absolutely.
The rate of inaccuracies and falsehoods doesn't matter. What matters is, is it saving you time and increasing your productivity. Verifying the accuracy of its statements is easy. While finding the knowledge it spits out in the first place is hard. The net balance is a huge positive.
People are bullish on LLMs because they can save you days' worth of work, like every day. My research productivity has gone way up with ChatGPT: asking it to explain ideas, related concepts, relevant papers, and so forth. It's amazing.
> Verifying the accuracy of its statements is easy.
For single statements, sometimes, but not always. For all of the many statements, no. Having the human attention and discipline to mindfully verify every single one without fail? Impossible.
Every software product/process that assumes the user has superhuman vigilance is doomed to fail badly.
> Automation centaurs are great: they relieve humans of drudgework and let them focus on the creative and satisfying parts of their jobs. That's how AI-assisted coding is pitched [...]
> But a hallucinating AI is a terrible co-pilot. It's just good enough to get the job done much of the time, but it also sneakily inserts booby-traps that are statistically guaranteed to look as plausible as the good code (that's what a next-word-guessing program does: guesses the statistically most likely word).
> This turns AI-"assisted" coders into reverse centaurs. The AI can churn out code at superhuman speed, and you, the human in the loop, must maintain perfect vigilance and attention as you review that code, spotting the cleverly disguised hooks for malicious code that the AI can't be prevented from inserting into its code. As qntm writes, "code review [is] difficult relative to writing new code":
> Having the human attention and discipline to mindfully verify every single one without fail? Impossible.
I mean, how do you live life?
The people you talk to in your life say factually wrong things all the time.
How do you deal with it?
With common sense, a decent bullshit detector, and a healthy level of skepticism.
LLMs aren't calculators. You're not supposed to rely on them to give perfect answers. That would be crazy.
And I don't need to verify "every single statement". I just need to verify whichever part I need to use for something else. I can run the code it produces to see if it works. I can look up the reference to see if it exists. I can Google the particular fact to see if it's real. It's really very little effort. And the verification is orders of magnitude easier and faster than coming up with the information in the first place. Which is what makes LLMs so incredibly helpful.
> Does it fabricate references? Absolutely, maybe about a third of the time
And you don't have concerns about that? What kind of damage is that doing to our society, long term, if we have a system that _everyone_ uses and it's just accepted that a third of the time it is just making shit up?
No, I don't. Because I know it does and it's incredibly easy to type something into Google Scholar and see if a reference exists.
Like, I can ask a friend and they'll mistakenly make up a reference. "Yeah, didn't so-and-so write a paper on that? Oh they didn't? Oh never mind, I must have been thinking of something else." Does that mean I should never ask my friend about anything ever again?
Nobody should be using these as sources of infallible truth. That's a bonkers attitude. We should be using them as insanely knowledgeable tutors who are sometimes wrong. Ask and then verify.
> And you don't have concerns about that? What kind of damage is that doing to our society, long term, if we have a system that _everyone_ uses and it's just accepted that a third of the time it is just making shit up?
Main problem with our society is that two thirds of what _everyone_ says is made up shit / motivated reasoning. The random errors LLMs make are relatively benign, because there is no motivation behind them. They are just noise. Look through them.
Could it end up being a net benefit? Will the realistic-sounding but incorrect facts generated by A.I. make people engage with arguments more critically, and be less likely to believe random statements they're given?
Now, I don't know, or even think it is likely that this will happen, but I find it an interesting thought experiment.
That's hilarious; I had no idea it was that bad. And for every conscientious researcher who actually runs down all the references to separate the 2/3 good from the 1/3 bad, how many will just paste them in, adding to the already sky-high pile of garbage out there?
LLMs will spit out responses with zero backing and 100% conviction. People see citations and assume they're correct. We're conditioned for it thanks to... everything ever in history. Rarely do I need to check a Wikipedia entry's source.
So why do people not understand this: it is absolutely going to pour jet fuel on the misinformation in the world. And we as a society are allowed to hold a higher bar for what we'll accept being shoved down our throats by corporate overlords who want their VC payout.
I think many people are just not really good at dealing with "imperfect" tools. Different tools have different success probabilities; let's call that probability p here. People typically use tools that have p=100%, or at least very close to it. But an LLM is a tool that is far from that, so making use of it takes a different approach.
Imagine there is a probabilistic oracle that can answer any yes/no question with success probability p. If p=100% or p=0% then it is obviously very useful (at p=0% you simply invert every answer). If p=50% then it is absolutely worthless, since it's indistinguishable from a coin flip. In other cases, such an oracle can be queried in cleverer ways to get the answer we want, and it is still a useful thing.
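That last point is a real technique: when the oracle is right more than half the time and its errors are independent, majority voting over repeated queries drives the error rate down. A toy simulation of that amplification (the independence assumption is the catch: LLM errors are often correlated, so this works less well in practice):

```python
import random

def oracle(p, truth=True):
    """A noisy yes/no oracle: returns the true answer with probability p."""
    return truth if random.random() < p else not truth

def majority_vote(p, n_queries, truth=True):
    """Query the oracle n_queries times and take the majority answer."""
    votes = sum(oracle(p, truth) for _ in range(n_queries))
    return votes > n_queries / 2

random.seed(0)
trials = 2000
raw = sum(oracle(0.6) for _ in range(trials)) / trials
amplified = sum(majority_vote(0.6, 51) for _ in range(trials)) / trials
print(raw, amplified)  # amplified accuracy ends up well above the raw ~60%
```

With p=0.6 and 51 votes, the majority is correct over 90% of the time in theory; the cost is 51x the queries, which is exactly the kind of tradeoff an "imperfect tool" forces on you.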
One of the magic things about engineering is that I can make usefulness out of unreliability. Voltage can fluctuate and I can transmit 1s and 0s, lines can fizz, machines can die, and I can reliably send video from one end to the other.
Unreliability is something we live in. It is the world. Controlling error, increasing signal over noise, extracting energy from the fluctuations. This is life, man. This is what we are.
I can use LLMs very effectively. I can use search engines very effectively. I can use computers.
Many others can’t. Imagine the sheer fortune to be born in the era where I was meant to be: tools transformative and powerful in my hands; useless in others’.
Your point reminded me of Terence Tao’s point that AI has a “plausibility problem”. When it can’t be accurate, it still disguises itself as accurate.
Its true success rate is by no means 100%, and sometimes is 0%, but it always tries to make you feel confident.
I’ve had to catch myself surrendering too much judgment to it. I worry a high school kid learning to write will have fewer qualms surrendering judgment
A scientific instrument that is unreliably accurate is useless. Imagine a kitchen scale that was off by +/- 50% every 3rd time you used it. Or maybe every 5th time. Or every 2nd.
So we're trying to use tools like this to help solve deeper problems, and they aren't up to the task. This is the point where we need to start over and get better tools. Sharpening a bronze knife will never make it as sharp, or hold an edge as long, as a steel knife. Same basic elements, very different material.
A bad analogy doesn't make a good argument. The best analogy for LLMs is probably a librarian on LSD in a giant library. They will point you in a direction if you have a question. Sometimes they will pull up the exact page you need, sometimes they will lead you somewhere completely wrong and confidently hand you a fantasy novel, trying to convince you it's a real science book.
It's completely up to your ability to both find what you need without them and verify the information they give you to evaluate their usefulness. If you put that on a matrix, this makes them useful in the quadrant of information that is both hard to find, but very easy to verify. Which at least in my daily work is a reasonable amount.
I think people confuse the power of the technology with the very real bubble we’re living in.
There’s no question that we’re in a bubble which will eventually subside, probably in a “dot com” bust kind of way.
But let me tell you…last month I sent several hundred million requests to AI, as a single developer, and got exactly what I needed.
Three things are happening at once in this industry…
(1) executives are over-promising a literal unicorn with AGI, which is totally unnecessary for the ongoing viability of LLMs and is pumping the bubble.
(2) the technology is improving and delivery costs are changing as we figure out what works and who will pay.
(3) the industry’s instincts are developing, so it’s common for people to think “AI” can do something it absolutely cannot do today.
But again…as one guy, for a few thousand dollars, I sent hundreds of millions of requests to AI that are generating a lot of value for me and my team.
Our instincts have a long way to go before we’ve collectively internalized the fact that one person can do that.
That's exactly what happened – I called the OpenAI API, using custom application code running on a server, a few hundred million times.
It is trivial for a server to send/receive 150 requests per second to the API.
This is what I mean by instincts...we're used to thinking of developers-pressing-keys as a fundamental bottleneck, and it still is to a point. But as soon as the tracks are laid for the AI to "work", things go from speed-of-human-thought to speed-of-light.
A lot of people are feeding all the email and slack messages for entire companies through AI to classify sentiment (positive, negative, neutral etc), or summarize it for natural language search using a specific dictionary. You can process each message multiple ways for all sorts of things, or classify images. There's a lot of uses for the smaller cheaper faster llms
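For scale, the fan-out itself is mundane engineering: a bounded pool of concurrent requests. A hedged sketch in Python, where `classify_sentiment` is a hypothetical stand-in for a real LLM API call (here it just simulates latency and returns a dummy label):

```python
import asyncio

async def classify_sentiment(message: str) -> str:
    """Stand-in for a real LLM API call (e.g. a chat-completions request).
    Simulates network latency and returns a dummy sentiment label."""
    await asyncio.sleep(0.01)
    return "positive" if "thanks" in message.lower() else "neutral"

async def classify_all(messages, max_concurrency=150):
    # A semaphore caps in-flight requests, mirroring provider rate limits.
    sem = asyncio.Semaphore(max_concurrency)

    async def worker(msg):
        async with sem:
            return await classify_sentiment(msg)

    return await asyncio.gather(*(worker(m) for m in messages))

labels = asyncio.run(classify_all(["thanks for the fix!", "ticket opened"] * 500))
print(len(labels), labels[0])  # 1000 positive
```

At 150 requests/second sustained, a few hundred million calls is a matter of weeks of wall-clock time on one server, which is the "speed-of-light" point being made above.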
And people just sit around, unimpressed, and complain that ... what ... it isn't a perfect superintelligence that understands everything perfectly? This is the most amazing technology I've experienced as a 50+ year old nerd that has been sitting deep in tech for basically my whole life. This is the stuff of science fiction, and while there totally are limitations, the speed at which it is progressing is insane. And people are like, "Wah, it can't write code like a Senior engineer with 20 years of experience!"
Crazy.
And this Tweeter's complaints do not sound like a demand for superintelligence. They sound like a demand for something far more basic than the hype has been promising for years now.
- "They continue to fabricate links, references, and quotes, like they did from day one."
- "I ask them to give me a source for an alleged quote, I click on the link, it returns a 404 error." (Why have these companies not manually engineered out a problem like this by now? Just do a check to make sure links are real. That's pretty unimpressive to me.)
- "They reference a scientific publication, I look it up, it doesn't exist."
- "I have tried Gemini, and actually it was even worse in that it frequently refuses to even search for a source and instead gives me instructions for how to do it myself."
- "I also use them for quick estimates for orders of magnitude and they get them wrong all the time."
- "Yesterday I uploaded a paper to GPT to ask it to write a summary and it told me the paper is from 2023, when the header of the PDF clearly says it's from 2025."
> Why have these companies not manually engineered out a problem like this by now? Just do a check to make sure links are real. That's pretty unimpressive to me.
There are no fabricated links, references, or quotes in OpenAI's GPT-4.5 + Deep Research.
It's unfortunate that the cost of a Deep Research bespoke white paper is so high. That mode is phenomenal for pre-work domain research: you get an analyst's two-week writeup in under 20 minutes, for the low cost of $200/month (though I've seen estimates that such a white paper costs OpenAI over USD 3,000 to produce, which explains the monthly limits).
You still need to be a domain expert to make use of this, just as you need to be to make use of an analyst. Both the analyst and Deep Research can generate flawed writeups with similar misunderstandings: mis-synthesizing, misapplication, or omitting something essential.
Neither analyst nor LLM is a substitute for mastery.
Hosted, free or subscription-based Deep Research-like tools that integrate LLMs with search functionality (the whole domain of "RAG", or Retrieval-Augmented Generation) will remain elementary for a long time, simply because the cost of the average query climbs steeply and there isn't that much money in it yet. Many people have built, and will continue to build, their own research tools, where they can decide how much compute time and API-access cost they're willing to spend on a given query. OCR remains a hard problem, let alone appropriately chunking potentially hundreds of long documents into context-length pieces and synthesizing potentially thousands of LLM outputs into a single response.
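The chunking step mentioned above is simple in the naive case. A minimal sketch (whitespace "tokens" and a fixed overlap are simplifying assumptions; real pipelines count tokens with the model's tokenizer and prefer sentence or section boundaries):

```python
def chunk_text(text, max_tokens=400, overlap=50):
    """Split a long document into overlapping windows that each fit a
    model's context budget, approximating tokens by whitespace words."""
    words = text.split()
    chunks = []
    step = max_tokens - overlap  # overlap keeps context across boundaries
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_tokens]))
        if start + max_tokens >= len(words):
            break
    return chunks

doc = " ".join(f"word{i}" for i in range(1000))
chunks = chunk_text(doc)
print(len(chunks), len(chunks[0].split()))  # 3 400
```

Each chunk then gets its own LLM call, and the per-chunk outputs are synthesized in a second pass, which is where the query costs multiply.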
OpenAI did similar things by focusing to the point of absurdity on 'safety' for what was basically a natural language search engine with a habit of inventing nonsensical stuff. But on that same note (and also as you alluded to), I do agree that LLMs have a lot of use as natural language search engines in spite of their proclivity to hallucinate. Being able to describe, e.g., a function call (or some esoteric piece of history) and then often get the precise term/event that I'm looking for is just incredibly useful.
But LLMs obviously are not sentient, are not setting us on the path to AGI, or any other such nonsense. They're arguably what search engines should have been 10 or 15 years ago, but anti-competitive monopolization of the industry meant that search engine technology progress basically stalled out, if not regressed for the sake of ads (and individual 'entrepreneurs' becoming better at SEO), about the time Google fully established itself.
I presume you are referring to this Google engineer, who was sacked for making the claim. Hardly an example of AI companies overhyping the tech; precisely the opposite, in fact. https://www.bbc.co.uk/news/technology-62275326
It seems to be a common human hallucination to imagine that large organisations are conspiring against us.
That's not what happened. Google stomped hard on Lemoine, saying clearly that he was wrong about LaMDA being sentient ... and then they fired him for leaking the transcripts.
Your whole argument here is based on false information and faulty logic.
The focus on safety, and the concept of "AI", preexisted the product. An LLM was just the thing they eventually made; it wasn't the thing they were hoping to make. They applied their existing beliefs to it anyway.
No, first time I hear about it. I guess the secret to happiness is not following leaks. I had very low expectations before trying LLMs and I’m extremely impressed now.
Did many people overhype LLMs? Yes, like with everything else (transhumanist ideas, quantum physics). It helps being more picky who one listens to, and whether they're just painting pretty pictures with words, or actually have something resembling a rational argument in there.
We are now at Artificial SUPER Intelligence.
I’m waiting for Artificial Pro Max Super Duper Intelligence.
For some tasks they're still next to useless, and people who do those tasks understandably don't get the hype.
Tell a lab biologist or chemist to use an LLM to help them with their work and they'll get very little useful out of it.
Ask an attorney to use it and it's going to miss things that are blindingly obvious to the attorney.
Ask a professional researcher to use it and it won't come up with good sources.
For me, I've had a lot of those really frustrating experiences where I'm having difficulty with a topic and it gives me utterly incorrect junk, because there just isn't a lot already published about that topic.
I've fed it tricky programming tasks and gotten back code that doesn't work, and that I can't debug because I have no idea what it's trying to do, or I'm not familiar with the libraries it used.
But truthfully, 90% of work-related programming is not problem solving; it's implementing business logic, and dealing with poor, ever-changing customer specs. Which an LLM will not help with.
If you think "it can't quite do what I need, I'll wait a little longer until it can" you may still be waiting 50 years from now.
For many industries/people, work is a means to earn, not something to be passionate about for its own sake. It's a means to provide for the things in life you are actually passionate about (e.g. family, lifestyle, etc.). AI may get your job eventually, but if it gets yours much later than other industries/domains, you win from a capital perspective: other goods get cheaper while you still command your pre-AI scarcity premium. That makes it easier to acquire assets from the early-disrupted industries and shields you from AI's eventual takeover.
I'm seeing this directly in software: fewer new frameworks/libraries/etc. outside the AI domain being published, IMO, and more apprehension from companies about open-sourcing their work and/or exposing what they do. Attracting talent is also no longer as strong a reason to showcase what you do to prospective employees; economic conditions and/or AI make that less necessary as well.
As with all LLM usage right now, it's a tool and not fit for every purpose. But it has legit uses for some attorney tasks.
This is because programmers talk on the forums that programmers scrape to get data to train the models.
The technology is indeed amazing and very amusing, but like all the good things in the hands of corporate overlords, it will be slowly turning into profit-milking abomination.
This is your interpretation of what these companies are saying. I'd love to see a source: has some company specifically claimed anything like that?
Out of the last 100 years, how many inventions have been made that could make any human awe the way LLMs do right now? How many things from today, when brought back to 2010, would make the person using them feel like they're being tricked or pranked? We already take them for granted even though they've been around for less than half a decade.
LLMs aren't a catch-all solution to the world's problems, or something that is going to help us in every facet of our lives, or an accelerator for every industry that exists out there. But at no point in history could you talk to your phone about general topics, get information, practice language skills, build an assistant that teaches your kid the basics of science, use something to accelerate your work in many different ways, etc.
Looking at LLMs shouldn't be boolean; it shouldn't be a choice between "they're the best thing ever invented" and "they're useless". But it seems like everyone presents the issue in this manner, and Sabine is part of that problem.
Yes, it hallucinates and if you replace your brain with one of these things, you won't last too long. However, it can do things which, in the hands of someone experienced, are very empowering. And it doesn't take an expert to see the potential.
As it stands, it sounds like a case of "it's great in practice but the important question is how good it is in theory."
Pinch of salt & all.
Akin to human cognition but still a few bricks short of a load, as it were.
Are you trying to say that LLMs are useful now but you think that will stop being the case at some point in the future?
I mean fine, argue that they're mistaken to be concerned, if that's your belief. But dismissing it all as obvious shilling is not that argument.
But, if you spend too much time fawning over how impressive these things are, you might forget that something being impressive doesn't translate into something being useful.
Well, are they useful? ... Yeah, of course LLMs are useful, but we need to remain somewhat grounded in reality. How useful are LLMs? Well, they can dump out a boilerplate React frontend to a CRUD API, so I can imagine it could very well be harmful to a lot of software jobs, but I hope it doesn't bruise too many egos to point out that dumping out yet another UI that does the same thing we've done 1,000,000 times before isn't exactly novel. So it's useful for some software engineering tasks. Can it debug a complex crash? So far I'm around zero for ten, and believe me, I'm trying. From Claude 3.7 to Gemini 2.5, Cursor to Claude Code, it's really hard to get these things to work through a problem the way anyone above the junior dev level can. Almost invariably, they just keep digging themselves deeper until they eventually give up and try to null out the code so that the buggy code path doesn't execute.
So when Sabine says they're useless for interpreting scientific publications, I have zero trouble believing that. Scoring high on some shitty benchmarks whose solutions are in the training set is not akin to generalized knowledge. And these huge context windows sound impressive, but dump a moderately large document into them and it's often a challenge to get them to actually pay attention to the details that matter. The best shot you have by far is if the document you need it to reference definitely was already in the training data.
It is very cool and even useful to some degree what LLMs can do, but just scoring a few more points on some benchmarks is simply not going to fix the problems current AI architecture has. There is only one Internet, and we literally lit it on fire to try to make these models score a few more points. The sooner the market catches up to the fact that they ran out of Internet to scrape and we're still nowhere near the singularity, the better.
Hardly. I pretty much have been using LLM at least weekly (most of the time daily) since GPT3.5. I am still amazed. It's really, really hard to not be bullish for me.
It kinda reminds me of the days when I learned the Unix-like command line. At least once a week, I shouted to myself: "What? There is a one-liner that does that? People use awk/sed/xargs this way??" That's how I feel about LLMs so far.
And yet that's exactly what people get paid to do every day. And if it saves them time, they won't exactly get bored of that feature.
They are useful enough that they can passably replace (much more expensive) humans in a lot of noncritical jobs, thus being a tangible tool for securing enterprise bottom lines.
This is so clearly biased that it borders on parody. You can only get out what you put in. The real use case of current LLMs is that any project that would previously require collaboration can now be done solo with a much faster turnaround. Of course, in 20 years when compute finally catches up, they will just be superintelligent AGI.
The big problem with being bullish in the stock market sense is that OpenAI isn't selling the LLMs that currently exist to their investors, they're selling AGI. Their pitch to investors is more or less this:
> If we accomplish our goal we (and you) will have infinite money. So the expected value of any investment in our technology is infinite dollars. No, you don't need to ask what the odds are of us accomplishing our goal, because any percent times infinity is infinity.
Since OpenAI and all the founders riding on their coat tails are selling AGI, you see a natural backlash against LLMs that points out that they are not AGI and show no signs of asymptotically approaching AGI—they're asymptotically approaching something that will be amazing and transformative in ways that are not immediately clear, but what is clear to those who are watching closely is that they're not approaching Altman's promises.
The AI bubble will burst, and it's going to be painful. I agree with the author that that is inevitable, and it's shocking how few people see it. But also, we're getting a lot of cool tech out of it and plenty of it is being released into the open and heavily commoditized, so that's great!
In the 90s, Robert Metcalfe infamously wrote "Almost all of the many predictions now being made about 1996 hinge on the Internet’s continuing exponential growth. But I predict the Internet, which only just recently got this section here in InfoWorld, will soon go spectacularly supernova and in 1996 catastrophically collapse." I feel like we are just hearing LLM versions of this quote over and over now, but they will prove to be equally accurate.
Generic. For the Internet, more complex questions would have been "What are the potential benefits, what the potential risks, what will grow faster" etc. The problem is not the growth but what that growth means. For LLMs, the big clear question is "will they stop just being LLMs, and when will they". Progress is seen, but we seek a revolution.
I think this is the source of a lot of the hype. There are people salivating at the thought of no longer needing to employ the peasant class. They want it so badly that they'll say anything to get more investment in LLMs even if it might only ever allow them to fire a fraction of their workers, and even if their products and services suffer because the output they get with "AI" is worse than what the humans they throw away were providing.
They know they're overselling it, but they're also still on their knees praying that by some miracle their LLMs trained on the collective wisdom of facebook and youtube comments will one day gain actual intelligence and they can stop paying human workers.
In the meantime, they'll shove "AI" into everything they can think of for testing and refinement. They'll make us beta test it for them. They don't really care if their AI makes your customer service experience go to shit. They don't care if their AI screws up your bill. They don't care if their AI rejects your claims or you get denied services you've been paying for and are entitled to. They don't care if their AI unfairly denies you parole or mistakenly makes you the suspect of a crime. They don't care if Dr. Sbaitso 2.0 misdiagnoses you. Your suffering is worth it to them as long as they can cut their headcount by any amount and can keep feeding the AI more and more information because just maybe with enough data one day their greatest dream will become reality, and even if that never happens a lot of people are currently making massive amounts of money selling that lie.
The problem is that the bubble will burst eventually. The more time goes by and AI doesn't live up to the hype the harder that hype becomes to sell. Especially when by shoving AI into everything they're exposing a lot of hugely embarrassing shortcomings. Repeating "AI will happen in just 10 more years" gives people a lot of time to make money and cash out though.
On the plus side, we do get some cool toys to play with and the dream of replacing humans has sparked more interest in robotics so it's not all bad.
Things could stall out and we'll have bumps and delays ... I hope. If this thing progresses at the same pace, or speeds up, well ... reality will change.
Or not. Even as they are, we can build some cool stuff with them.
The trouble is that, while incredibly amazing, mind blowing technology, it falls down flat often enough that it is a big gamble to use. It is never clear, at least to me, what it is good at and what it isn't good at. Many things I assume it will struggle with, it jumps in with ease, and vice versa.
As the failures mount, I admittedly do find it becoming harder and harder to compel myself to see if it will work for my next task. It very well might succeed, but by the time I go to all the trouble to find out it often feels that I may as well just do it the old fashioned way.
If I'm not alone, that could be a big challenge in seeing long-term commercial success. Especially given that commercial success for LLMs is currently defined as 'take over the world' and not 'sustain mom and pop'.
> the speed at which it is progressing is insane.
But same goes for the users! As a result the failure rate appears to be closer to a constant. Until we reach the end of human achievement, where the humans can no longer think of new ways to use LLMs, that is unlikely to change.
The author says they use several LLMs every day and they always produce incorrect results. That "feels" weird, because it seems like you'd develop an intuition fairly quickly for the kinds of questions you'd ask that LLMs can and can't answer. If I want something with links to back up what is being said, I know I should ask Perplexity or maybe just ask a long-form prompt-like question of Google or Kagi. If I want a Python or bash program I'm probably going to ask ChatGPT or Gemini. If I want to work on some code I want to be in Cursor and am probably using Claude. For general life questions, I've been asking Claude and ChatGPT.
Running into the same issue with LLMs over and over for years, with all due respect, seems like the "doing the same thing and expecting different results" situation.
If you work on stuff that is at all niche (as in, stack overflow was probably not going to have the answer you needed even before LLMs became popular), then it's not surprising when LLMs can't help because they've not been trained.
For people that were already going fast and needed or wanted to put out more code more quickly, I'm sure LLMs will speed them up even more.
For those of us working on niche stuff, we weren't going fast in the first place or being judged on how quickly we ship in all likelihood. So LLMs (even if they were trained on our stuff) aren't going to be able to speed us up because the bottleneck has never been about not being able to write enough code fast enough. There are architectural and environmental and testing related bottlenecks that LLMs don't get rid of.
I don't think I'm working on anything particularly niche, but nor is it cookie-cutter generic either, and that could be enough to drastically reduce their utility.
I just tried to use the latest Gemini release to help me figure out how to do some very basic Google Cloud setup. I thought my own ignorance in this area was to blame for the 30 minutes I spent trying to follow its instructions - only to discover that Gemini had wildly hallucinated key parts of the plan. And that’s Google’s own flagship model!
I think it’s pretty telling that companies are still struggling to find product-market fit in most fields outside of code completion.
But ask it to solve some LeetCode and it's brilliant.
I should start collecting examples, if only for threads like this. Recently I tried to get an LLM to write a tsserver plugin that treats lines ending with "//del" as empty. You can only imagine all the sneaky failures in the chat and the total uselessness of the results.
Anything that is not literally millions (billions?) of times in the training set is doomed to be fantasized about by an LLM. In various ways, tones, etc. After many such threads I came to the conclusion that people who find it mostly useful are simply treading water, as they probably have done most of their career. Their average product is a React form with a CRUD endpoint, and excitement about it. I can't explain their success reports otherwise, because it rarely works on anything beyond that.
And how many actual humans, with a fair bit of training, can become a little bit less than useless?
I mean, my parents used to have this dog that would just look at you like "go get your own damn ball, stupid human" if you threw a ball around him.
--edit--
and, yes, the dog also made grammatical mistakes.
Because it has a sample size of our collective human knowledge and language big enough to trick our brains into believing that.
As a parallel thought, it reminds me of a trick Derren Brown did. He picked every horse correctly across 6 races. The person he was picking for was obviously stunned, as were the audience watching it.
The reality of course is just that people couldn't comprehend that he just had to go to extreme and tedious lengths to make this happen. They started with 7000 people and filmed every one like it was going to be the "one" and then the probability pyramid just dropped people out. It was such a vast undertaking of time and effort that we're biased towards believing there must be something really happening here.
LLMs currently are a natural-language interface to a Microsoft Encarta-like system that is so unbelievably detailed and all-encompassing that we risk accepting that there's something more going on there. There isn't.
Yes, it's artificial intelligence. It's not the real thing, it's artificial.
No, that's not my problem with it. My problem with it is that inbuilt into the models of all LLMs is that they'll fabricate a lot. What's worse, people are treating them as authoritative.
Sure, sometimes it produces useful code. And often, it'll simply call the "doTheHardPart()" method. I've even caught it literally writing the wrong algorithm when asked to implement a specific and well known algorithm. For example, asking it "write a selection sort" and watching it write a bubble sort instead. No amount of re-prompts pushes it to the right algorithm in those cases either, instead it'll regenerate the same wrong algorithm over and over.
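For reference, selection sort really is just "repeatedly swap the minimum of the unsorted suffix into place", which makes confusing it with bubble sort all the more telling. A minimal Python sketch (function name and test values are mine, not from the thread):

```python
def selection_sort(items):
    """In-place selection sort: repeatedly select the minimum of the
    unsorted suffix and swap it to the front of that suffix."""
    a = list(items)  # work on a copy
    for i in range(len(a)):
        # index of the smallest element in a[i:]
        min_idx = min(range(i, len(a)), key=a.__getitem__)
        a[i], a[min_idx] = a[min_idx], a[i]
    return a

print(selection_sort([5, 2, 9, 1, 5, 6]))  # [1, 2, 5, 5, 6, 9]
```

The tell-tale difference: selection sort does one swap per outer iteration, while bubble sort swaps adjacent pairs repeatedly, so the two are easy to distinguish at a glance.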
Outside of programming, this is much worse. I've both seen online and heard people quote LLM output as if it were authoritative. That to me is the bigger danger of LLMs to society. People just don't understand that LLMs aren't high powered attorneys, or world renown doctors. And, unfortunately, the incorrect perception of LLMs is being hyped both by LLM companies and by "journalists" who are all to ready to simply run with and discuss the press releases from said LLM companies.
Still the elephant in the room. We need an AI technology that can output "don't know" when appropriate. How's that coming along?
That's not an LLM problem, but it is indeed quite bothersome. Don't tell me what ChatGPT told you; tell me what you know. Maybe you got it from ChatGPT and verified it. Great. But my jaw kind of drops when people cite an LLM and just assume it's correct.
The underlying cause: 3rd order ignorance:
3rd Order Ignorance (3OI)—Lack of Process. I have 3OI when I don't know a suitably efficient way to find out I don't know that I don't know something. This is lack of process, and it presents me with a major problem: If I have 3OI, I don't know of a way to find out there are things I don't know that I don't know.
(not from an LLM)
My process: use llms and see what I can do with them while taking their Output with a grain of salt.
I'm counting down the days until some AI hallucination makes its way all the way up to the C-suite. People are getting way too comfortable with AI and don't understand just how wrong it can be.
Some assumption will come from AI, no one will check it, and it'll become a basic business input. Then suddenly one day someone smart will say "that's not true" and someone will trace it back to AI. I know it.
I assume at that point in time there will be some general directive on using AI and not assuming it’s correct. And then AI will slowly go out of favor.
Claude is cheaper, faster, produces better code.
The solution is to be selective and careful like always
The same is true about the internet, and people even used to use these arguments to try to dissuade people from getting their information online (back when Wikipedia was considered a running joke, and journalists mocked blogs). But today it would be considered silly to dissuade someone from using the internet just because the information there is extremely unreliable.
Many programmers will say Stack Overflow is invaluable, but it's also unreliable. The answer is to use it as a tool and a jumping-off point to help you solve your problem, not to assume that it's authoritative.
The strange thing to me these days is the number of people who will talk about the problems with misinformation coming from LLMs, but then who seem to uncritically believe all sorts of other misinformation they encounter online, in the media, or through friends.
Yes, you need to verify the information you're getting, and this applies to far more than just LLMs.
Because in people's experience, LLMs are often correct.
You are right LLMs are not authoritative, but people trust it exactly because they often do produce correct answers.
Happened to me as well. Wanted it to quickly write an algorithm for standard deviation over a stream of data, which is a textbook algorithm. It did it almost right, but messed up the final formula and the code gave wrong answers. Weird, considering correct code for that problem exists on Wikipedia.
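The textbook streaming algorithm the parent is presumably describing is Welford's online method; a minimal Python sketch (class name and sample data are mine):

```python
import math

class RunningStddev:
    """Welford's online algorithm: numerically stable mean and variance
    over a stream, updated one value at a time."""
    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def add(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def stddev(self):
        # sample standard deviation (n - 1 denominator)
        return math.sqrt(self.m2 / (self.n - 1)) if self.n > 1 else 0.0

rs = RunningStddev()
for v in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:
    rs.add(v)
print(round(rs.stddev(), 4))  # 2.1381
```

The easy-to-flub part is exactly the final formula: dividing `m2` by `n` instead of `n - 1` (or taking the square root in the wrong place) gives plausible-looking but wrong answers, which matches the failure described above.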
FWIW, here's 4o writing a selection sort: https://chatgpt.com/share/67e60f66-aacc-800c-9e1d-303982f54d...
Code created by LLMs doesn't compile: hallucinated APIs, invalid syntax, completely broken logic. Why would you trust it with someone's life?
So what? People are wrong all the time. What happens when people are wrong? Things go wrong. What happens then? People learn that the way they got their information wasn't robust enough and they'll adapt to be more careful in the future.
This is the way it has always worked. But people are "worried" about LLMs... Because they're new. Don't worry, it's just another tool in the box, people are perfectly capable of being wrong without LLMs.
The LLM doesn’t need to be perfect. Just needs to beat a typical human.
LLM opponents aren’t wrong about the limits of LLMs. They vastly overestimate humans.
If you're lucky it figures it out. If you aren't, it makes stuff up in a way that seems almost purposefully calculated to fool you into assuming that it's figured everything out. That's the real problem with LLMs: they fundamentally cannot be trusted because they're just a glorified autocomplete; they don't come with any inbuilt sense of when they might be getting things wrong.
What matters is speeding up how fast I can find information. Not only will LLMs sometimes answer my obscure questions perfectly themselves, but they also help to point me to the jargon I need to find that information online. In many areas this has been hugely valuable to me.
Sometimes you do just have to cut your losses. I've given up on asking LLMs for help with Zig, for example. It is just too obscure a language I guess, because the hallucination rate is too high to be useful. But for webdev, Python, matplotlib, or bash help? It is invaluable to me, even though it makes mistakes every now and then.
Spend some time with current reasoning models. Your experience is obsolete if you still hold this belief.
- LLMs are a miraculous technology that are capable of tasks far beyond what we believed would be achievable with AI/ML in the near future. Playing with them makes me constantly feel like "this is like sci-fi, this shouldn't be possible with 2025's technology".
- LLMs are fairly clueless for many tasks that are easy enough for humans, and they are nowhere near AGI. It's also unclear whether they scale up towards that goal. They are also worse programmers than people make them out to be. (At least I'm not happy with their results.)
- Achieving AGI doesn't seem impossibly unlikely any more, and doing so is likely to be an existentially disastrous event for humanity, and the worst fodder of my nightmares. (Also in the sense of an existential doomsday scenario, but even just the thought of becoming... irrelevant is depressing.)
Having one of these beliefs makes me the "AI hyper" stereotype, another makes me the "AI naysayer" stereotype and yet another makes me the "AI doomer" stereotype. So I guess I'm all of those!
In my opinion, there can exist no AI, person, tool, ultra-sentient omniscient being, etc. that would ever render you irrelevant. Your existence, experiences, and perception of reality are all literally irreplaceable, and (again, just my opinion) inherently meaningful. I don't think anyone's value comes from their ability to perform any particular feat to any particular degree of skill. I only say this because I had similar feelings of anxiety when considering the idea of becoming "irrelevant", and I've seen many others say similar things, but I think that fear is largely a product of misunderstanding what makes our lives meaningful.
When I tried that technology, 90% accuracy meant 1 out of every 10 things I wrote was incorrect. If it had been a keyboard I would have thrown it in the trash. That is where my Palm ended up.
People expect their technology to do things better, not almost as well as a human. Waymo, with LIDAR, hasn't killed people. Tesla, with cameras only, has done so multiple times. I will ride in a Waymo, never in a Tesla self-driving car.
Almost every counter-criticism of LLMs boils down to:
1. you're holding it wrong
2. Well, I use it at $DAYJOB and it works great for me! (And $DAYJOB is software engineering.)
I'm glad your wife was able to save 2 hours of work, but forgive me if that doesn't translate to the trillion dollar valuation OpenAI is claiming. It's strange you don't see the inherent irony in your post. Instead of your wife just directly uploading the dataset and a prompt, she first has to prompt it to write code. There are clear limitations and it looks like LLMs are stuck at some sort of wall.
But I share the confusion about why people are still so bullish on it. The current valuation exists because the market thinks these systems can write code like a senior engineer and will achieve AGI, because that's how the LLM providers market them.
I'm not even certain if they'll be ubiquitous after the venture capital investments are gone and the service needs to actually be priced without losing money, because they're (at least currently) mostly pretty expensive to run.
In markets, perception is reality, and the perception is that these companies are innovative. That’s it.
NFT is still a great tool if you want a bunch of unique tokens as part of a blockchain app. ERC-721 was proven a capable protocol in a variety of projects. What it isn't, and never will be, is an amazing investment opportunity, or a method to collect cool rare apes and go to yacht parties.
LLMs will settle in and have their place too, just not in the forefront of every investors mind.
Based on that alone it’s worth quite a lot.
They aren't building anything themselves. I find this to be disingenuous at best, and to me it's a sign of a bubble.
I also think that re-branding machine learning as AI is disingenuous.
These technologies of course have their use cases and excel in some things, but this isn’t the ushering of actual, sapient intelligence, that for the majority of the term’s existence was the de facto agreed standard for the term “AI”. This technology does lack the actual markers of what is generally accepted as intelligence to begin with
We saw this with the web. Pets.com was not a billion-dollar company, but the web was real.
I am actually of the belief that LLMs will be amazing but that rank and file companies are going to be the ones that benefit the most.
Just like the internet.
But Moore's law should kick in, shouldn't it?
No it's not. If it was valued for that it'd be at least 10X what it is now.
Blockchains are becoming real-time data structures where everyone has admin level read-only access to everyone.
It reminds me a lot of when I first started playing No Man's Sky (the video game). Billions of galaxies! Exotic, one of a kind life forms on every planet! Endless possibilities! I poured hundreds of hours into the game! But, despite all the variety and possibilities, the patterns emerge, and every 'new' planet just feels like a first-person fractal viewer. Pretty, sometimes kinda nifty, but eventually very boring and repetitive. The illusion wore off, and I couldn't really enjoy it anymore.
I have played with a LOT of models over the years. They can be neat, interesting, and kinda cool at times, but the patterns of output and mistakes shatters the illusion that I'm talking to anything but a rather expensive auto-complete.
It'd take more time for me to flesh this out than I want to give, but the basic idea is that I am not just sitting there "expecting things". I've been puzzled too at why so many people don't seem to get it or are as frustrated as this lady, and in my observation this is their common element. It just looks very passive to me, the way they seem to use the machines and expect a result to be "given" to them.
PS. It reminds me very strongly of how our parents' generation uses computers. Like the whole way of thinking is different; I cannot even understand why they would act certain ways or be afraid of acting in other ways. It's like they use a different compass or have a very different (and wrong) model in their head of how this thing in front of them works.
IMO there are two distinct reasons for this:
1. You've got the Sam Altmans of the world claiming that LLMs are, or nearly are, AGI and that ASI is right around the corner. It's obvious this isn't true even if LLMs are still incredibly powerful and useful. But Sam doing the whole "is it AGI?" dance gets old really quickly.
2. LLMs are an existential threat to basically every knowledge worker job on the planet. People's natural response to threats is to become defensive.
Just off the top of my head there are plenty of knowledge worker jobs where the knowledge isn’t public, nor really in written form anywhere. There just simply wouldn’t be anything for AI to train on.
Given the typical problems of LLMs, they are not. You still need to check their results. It's like FSD: impressive when it works, bad when it doesn't, and scary because you never know beforehand when it's failing.
I feel bad for people who haven't yet experienced how useful these models are for programming.
Some also just prefer manually entering everything. Those people I will never understand.
For reference, I program systems code in C/C++ in a large, proprietary codebase.
My experiences with OpenAI (a year ago or more), and more recently Cursor, Grok-v3 and Deepseek-r1, were all failures. The latter two started out OK and got worse over time.
What I haven't done is ask "AI" to whip up a more standard application. I have some ideas (an ncurses frontend to p4 written in Python, similar to tig, for instance), but haven't gotten around to it.
I want this stuff to work, but so far it hasn't. Now I don't think "programming" a computer in english is a very good idea anyway, but I want a competent AI assistant to pair program with. To the degree that people are getting results, to me it seems they are leveraging very high-level APIs/libraries of code which are not written by AI and solving well-solved, "common" problems(simple games, simple web or phone apps). Sort of like how people gloss over the heavy lifting done by language itself when they praise the results from LLMs in other fields.
I know it eventually will work. I just don't know when. I also get annoyed by the hype of folks who think they can become software engineers because they can talk to an LLM. Most of my job isn't programming. Most of my job is thinking about what the solution should be, talking to other people like me in meetings, understanding what customers really want beyond what they are saying, and tracking what I'm doing in various forms(which is something I really do want AI to help me with).
Vibe coding is aptly named because it's sort of the VB6 of the modern era. Holy cow! I wrote a Windows GUI App!!!. It's letting non-programmers and semi-programmers(the "I write glue code in Python to munge data and API ins/outs" crowd) create usable things. Cool! So did spreadsheets. So did Hypercard. Andrej tweeting that he made a phone app was kinda cool but also kinda sad. If this is what the hundreds of billions spent on AI(and my bank account thanks you for that) delivers then the bubble is going to pop soon.
That's okay.
It's not my responsibility to convince or convert them.
I prefer to just let them be and not engage.
As the parent says, while far from perfect, they're an incredible aid in so many areas. When used well, they help you produce not just faster but also better results. The only trick really is that you need to treat it as a (very knowledgeable but overconfident) collaborator rather than an oracle.
I say "intern" in the sense that its error-prone and kind of inexperienced, but also generally useful. I can ask it to automatically create a lot of the bootstrapping or tedious code that I always dread writing so that I can focus on the fun stuff, which is often the stuff that's pawned off onto interns and junior-level engineers. I think for the most part, when you treat it like that, it lives up to and sometimes even surpasses expectations.
I mean, I can't speak for everyone, but whenever I begin a new project, a large percentage of the first ~3 hours is simply copying and pasting and editing from documentation, either an API I have to call or some bootstrapping code from a framework or just some cruft to make built-in libraries work how you want. I hate doing all that, it actively makes me not want to start a new project. Being able to get ChatGPT to give me stuff that I need to actually get started on my project has made coding a lot more fun for me again. At this point, you can take my LLM from my cold dead hands.
I do think it will keep getting better, but I'm also at a point where even if it never improves I will still keep using it.
> "As of today, March 27, 2025, the latest stable version of Laravel is Laravel 11, which was released in March 2024. Laravel 12 has not been released yet (it's expected roughly in Q1 2026 based on the usual schedule). Could you please double-check the exact Laravel version you are using?"
So it did not believe me, and I had to convince it first that I was using a real version. This went on for a while, with Gemini not only hallucinating stuff, but also being very persistent and difficult to convince of anything else.
Well, in the end it was still certain that this method should exist, even though it could not provide any evidence for it and my searching through the internet and the Git history of the related packages did also not provide any results.
So I gave up and tried it with Claude 3.7 which could also not provide any working solution.
In the end, I found an entirely different solution for my problem, but that wasn't based on anything the AIs told me, but just my own thinking and talking to other software developers.
I would not go so far as to call these AIs useless. In software development they can help with simple stuff and boilerplate code, and I found them a lot more helpful in creative work. This is basically the opposite of what I would have expected 5 years ago ^^
But for any important tasks, these LLMs are still far too unreliable. They often feel like they have a lot of knowledge, but no wisdom. They don't know how to apply their knowledge ideally, and they often basically brute-force it with a mix of strange creativity and statistical models that are apparently based on a vast amount of internet content that has big parts of troll content and satire.
But instead, my productivity is hampered by issues with org communication, structure, siloed knowledge, lack of documentation, tech debt, and stale repos.
I have for years tried to provide feedback and get leadership to do something about these issues, but they do nothing and instead ask "How have you used AI to improve your productivity?"
Thing is, the LLMs that I use are all freeware, and they run on my gaming PC. Two to six tokens per second are alright honestly. I have enough other things to take care of in the meantime. Other tools to work with.
I don't see the billion dollar business. And even if that existed, the means of production would be firmly in the hands of the people, as long as they play video games. So, have we all tripled our salaries?
If we haven't, is that because knowledge work is a limited space that we are competing in, and LLMs are an equalizer because we all have them? Because I was taught that knowledge work was infinite. And the new tool should allow us to create more, and better, and more thoroughly. And that should get us all paid better.
Right?
The problems start when people start hyperventilating because they think since LLMs can generate tests for a function for you, that they'll be replacing engineers soon. They're only suitable for generating output that you can easily verify to be correct.
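A hedged illustration of that "easily verified" sweet spot (the function and test names here are made up): a trivial utility plus the kind of test an LLM will happily generate, where a human can confirm correctness at a glance:

```python
import re

def slugify(title: str) -> str:
    """Lowercase, and replace runs of non-alphanumerics with '-'."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")

# LLM-style generated tests: each case is obvious enough to eyeball,
# which is exactly why this is a safe thing to delegate.
def test_slugify():
    assert slugify("Hello, World!") == "hello-world"
    assert slugify("  spaces  ") == "spaces"
    assert slugify("already-fine") == "already-fine"
```

The moment the output stops being glanceable, the delegation stops being safe.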
LLM training is designed to distill a massive corpus of facts, in the form of token sequences, into a much, much smaller bundle of information that encodes (somehow!) the deep structure of those facts minus their particulars.
They’re not search engines, they’re abstract pattern matchers.
1. People creating or dealing with imprecise information. People doing SEO spam, people dealing with SEO spam, almost all creative arts people, people writing corporatese or legalese documents or mails, etc. For these tasks LLMs are god-like.
2. People dealing with precise information and/or facts. For these people an LLM is no better than a parrot.
3. Subset of 2 - programmers. Because of the huge amount of stolen training data, plus almost-perfect proofing software in the form of compilers, static analyzers, etc., LLMs are more or less usable for this case; the more data that was used, the better (JS is the best, as I understand it).
This is why people's reactions are so polarized. Their results differ.
The crisis in programming hasn’t been writing code. It has been developing languages and tools so that we can write less of it that is easy to verify as correct. These tools generate more code. More than you can read and more than you will want to before you get bored and decide to trust the output. It is trained on the most average code available that could be sucked up and ripped off the Internet. It will regurgitate the most subtle errors that humans are not good at finding. It only saves you time if you don’t bother reading and understanding what it outputs.
I don’t want to think about the potential. It may never materialize. And much of what was promised even a few years ago hasn’t come to fruition. It’s always a few years away. Always another funding round.
Instead we have massive amounts of new demand for liquid methane, infrastructure struggling to keep up, billions of gallons of fresh water wasted, all so that rich kids can vibe code their way to easy money and realize three months later they’ve been hacked and they don’t know what to do. The context window has been lost and they ran out of API credits. Welcome to the future.
- AI is great for disinformation
- AI is great at generating porn of women without their consent.
- Open source projects massively struggle as AI scrapers DDOS them.
- AI uses massive amounts of energy and water; most importantly, the expectation is that energy usage will rise drastically in a world where we need to lower it. If Sam Altman gets his way, we're toast.
- AI makes us intellectually lazy and worse thinkers. We were already learning less and less in school because of our impoverished attention span. This is even worse now with AI.
- AI makes us even more dependent on cloud vendors and third-parties, further creating a fragile supply chain.
Like AI ostensibly empowers us as individuals, but in reality I think it's a disservice, and the ones it truly empowers are the tech giants, as citizens become dumber and even more dependent on them and tech giants amass more and more power.
I have yet to see an AI-generated image that was "really cool".
AI images and videos strike me as the coffee pods of the digital world -- we're just absolutely littering the internet with garbage. And as a bonus, it's also environmentally devastating to the real world!
I live nearby a landfill, and go there often to get rid of yard waste, construction materials, etc. The sheer volume of perfectly serviceable stuff people are throwing out in my relatively small city (<200k) is infuriating and depressing. I think if more people visited their local landfills, they might get a better sense for just how much stuff humans consume and dispose. I hope people are noticing just how much more full of trash the internet has become in the last few years. It seems like it, but then I read this thread full of people that are still hyped about it all and I wonder.
This isn't even to mention the generated text... it's all just so inane and I just don't get it. I've tried a few times to ask for relatively simple code and the results have been laughable.
I don't have a proposal for what a better name would have been, naming things is hard, but AI carries quite a bit of baggage and expectations with it.
1. Some people are just uncomfortable with it because it “could” replace their jobs.
2. Some people are warning that the ecosystem bubble is significantly out of proportion. They are right, and having the whole stock market, companies, and US economy attached to LLMs is just downright irresponsible.
What jobs are seriously at risk of being totally replaced by LLMs? Even in things like copywriting and natural language translation, which is somewhat of a natural "best case" for the underlying tech, their output is quite subpar compared to the average human's.
Hossenfelder is a scientist. There's a certain level of rigour that she needs to do her job, which is where current LLMs often fall down. Arguably it's not accelerating her work to have to check every single thing the LLM says.
I think some people just aren't using them correctly or don't understand their limitations.
They are especially good at helping me get over thought paralysis when starting a new project.
But while they are fun to play with, anything that requires a real answer, but can’t be directly and immediately checked, like customer support, scientific research, teaching, legal advice, identifying humans, correctly summarizing text - LLMs are very bad at these things, make up answers, mix contexts inappropriately, and more.
I’m not sure how you can have played with LLMs so much and missed this. I hope you don’t trust what they say about recipes or how to handle legal problems or how to clean things or how to treat disease or any fact-checking whatsoever.
This is like a GPT3.5 level criticism. o1-pro is probably better at pure fact retrieval than most PhDs in any given field. I challenge you to try it.
In fact take the GPQA test yourself and see how you do then give the same questions to o1. https://arxiv.org/pdf/2311.12022
I wonder if people that are amazed by LLM lack this information gathering skill.
After all, I met plenty of architect- and senior-level people that just… had zero google and research skills.
To someone who doesn't actually check or have the knowledge or experience to check the output, it sounds like they've been given a real, useful answer.
When you tell the LLM that the API it tried to call doesn't exist it says "Oh, you're right, sorry about that! Here's a corrected version that should work!" and of course that one probably doesn't work either.
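A cheap partial defense (just a sketch, and it only catches name-level hallucinations, not wrong behavior) is to mechanically check that a suggested call even resolves before spending time on it:

```python
import importlib

def api_exists(dotted: str) -> bool:
    """Return True if e.g. 'numpy.linalg.solve' names a real attribute.

    A sanity check to run on LLM-suggested calls before trusting them.
    """
    module_name, _, attr_path = dotted.partition(".")
    try:
        obj = importlib.import_module(module_name)
    except ImportError:
        return False
    # Walk the remaining attribute path, bailing at the first miss.
    for part in attr_path.split(".") if attr_path else []:
        if not hasattr(obj, part):
            return False
        obj = getattr(obj, part)
    return True
```

It won't tell you the call does what the LLM claims, but it instantly flags the "magic function that doesn't exist" class of answer.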
One takeaway from this is that labelling LLMs as "intelligent" is a total misnomer. They're more like super parrots.
For software development, there's also the problem of how up to date they are. If they could learn on the fly (or be constantly updated) that would help.
They are amazing in some ways, but they've been over-hyped tremendously.
When I saw GPT-3 in action in 2023, I couldn’t believe my eyes. I thought I was being tricked somehow. I’d seen ads for “AI-powered” services and it was always the same unimpressive stuff. Then I saw GPT-3 and within minutes I knew it was completely different. It was the first thing I’d ever seen that felt like AI.
That was only a few years ago. Now I can run something on my 8GB MacBook Air that blows GPT-3 out of the water. It’s just baffling to me when people say LLMs are useless or unimpressive. I use them constantly and I can still hardly believe they exist!!
Exactly how I feel. I probably write 50 prompts/day, and a few times a week I still think, "I can't believe this is real tech."
It's bad technology because it wastes a lot of labor, electricity, and bandwidth in a struggle to achieve what most human beings can with minimal effort. It's also a blatant thief of copyrighted materials.
If you want to like it, guess what, you'll find a way to like it. If you try to view it from another persons use case you might see why they don't like it.
It is an impressive technology but is it US$244.22bn [1] impressive (I know this stat is supposed to account for computer vision as well but seeing as to how LLMs are now a big chunk of that I think it's a safe assumption)? It's projected to grow to over US$1tr by 2031. That's higher than the market size of commercial aviation at its peak [2]. I'm sorry if I agree that a cool chatbot is not approximately as important as flying.
[1] https://www.statista.com/outlook/tmo/artificial-intelligence...
[2] https://www.statista.com/markets/419/topic/490/aviation/#sta...
You no longer have the console as the primary interface, but a GUI, which 99.9+% of computer users control via a mouse.
You no longer have the screen as the primary interface, but an AUI, which 99.9+% of computer users control via a headset, earbuds, or a microphone and speaker pair.
You mostly speak and listen to other humans, and if you're not reading something they've written, you could have it read to you in order to detach from the screen or paper.
You'll talk with your computer while in the car, while walking, or while sitting in the office.
An LLM makes the computer understand you, and it allows you to understand the computer.
Even if you use smart glasses, you'll mostly talk to the computer generating the displayed results, and it will probably also talk to you, adding information to the displayed results. It's LLMs that enable this.
Just don't focus too much on whether the LLM knows how high Mount Kilimanjaro is; its knowledge of that fact is simply a hint that it can properly handle language.
Still, it's remarkable how useful they are at analyzing things.
LLMs have a bright future ahead, or whatever technology succeeds them.
It used to be annoying enough just having to clean the trackball, but at least you knew when it wasn't working.
Personally, I look back at how many years ago it was that we were seeing claims that truck drivers were all going to lose their jobs and society would tear itself apart over it within the next few years… and yet here we still are.
That said, I do experience frustrations:
- Getting enraged when it messes up perfectly good code it wrote just 10 minutes ago
- Constantly reminding it we're NOT using jest to write tests
- Discovering it's created duplicate utilities in different folders
There's definitely a lot of hand-holding required, and I've encountered limitations I initially overlooked in my optimism.
But here's what makes it worthwhile: LLMs have significantly eased my imposter syndrome when it comes to coding. I feel much more confident tackling tasks that would have filled me with dread a year ago.
I honestly don't understand how everyone isn't completely blown away by how cool this technology is. I haven't felt this level of excitement about a new technology since I discovered I could build my own Flash movies.
But for larger tasks—say, around 2,000 lines of code—it often fails in a lot of small ways. It tends to generate a lot of dead code after multiple iterations, and might repeatedly fail on issues you thought were easy to fix. Mentally, it can get exhausting, and you might end up rewriting most of it yourself. I think people are just tired of how much we expect LLMs to deliver, only for them to fail us in unexpected ways. The LLM is good, but we really need to push to understand its limitations.
So far the industrial applications haven't been that promising, code writing and documentation is probably the most promising but even there it's not like it can replace a human or even substantially increase their productivity.
If you don’t constantly look for information, they might be less useful.
I did have a eureka moment the other day with deepseek and a very obscure bug I was trying to tackle. One api query was having a very weird, unrelated side effect. I loaded up cursor with a very extensive prompt and it actually figured out the call path I hadn't been able to track down.
Today, I had a very simple task that eventually only took me half an hour to manually track. But I started with cursor using very similar context as the first example. It just kept repeatedly dreaming up non-existent files in the PR and making suggestions to fix code that doesn't exist.
So what's the worth to my company of my very expensive time? Should I spend 10,20,50 percent of my time trying to get answers from a chatbot, or should I just use my 20 years of experience to get the job done?
Same vibe.
The quote about books being a mirror reflecting genius or idiocy seems to apply.
I see LLMs as a kind of hyper-keyboard: speeding up typing AND structuring content, completing thoughts, and inspiring ideas.
Unlike a regular keyboard, an LLM transforms input contextually. One no longer merely types but orchestrates concepts and modulates language, almost like music.
Yet mastery is key. Just as a pianist turns keystrokes into a symphony through skill, a true virtuoso wields LLMs not as a crutch but as an amplifier of thought.
In the 70's I read in some science book for kids about how one day we will likely be able to use light emitting diodes for illumination instead of light bulbs, and this "cold light" will save us lots of energy. Waited out that one too; it turned out so.
By the way, you don't need to be a 50+ year old nerd. Nerds are a special culture-pen where smart straight-A students from schools are placed so they can work, increase stakeholder revenues, and not even accidentally be able to do anything truly worthwhile that could redistribute wealth in society.
More like we note the frequency with which these tools produce shallow bordering on useless responses, note the frequency with which they produce outright bullshit, and conclude their output should not be taken seriously. This smells like the fervor around ELIZA, but with several multinational marketing campaigns behind it pushing.
https://www.ycombinator.com/companies/domu-technology-inc/jo...
If we judge a technology by how it transforms our lives, LLMs and GenAI have mostly been a net negative (at least that is how it feels).
Anyone who remembers further back than a decade or so remembers when the height of AI research was chess programs that could beat grandmasters. Yes, LLMs aren't C3PO or the like, but they are certainly more like that than anything we could imagine just a few years ago.
I remember seeing an AI lab in the late 1980's and thinking "that's never going to work" but here we are, 40 years later. It's finally working.
I feel like if teleportation was invented tomorrow, people would complain that it can't transport large objects so it's useless.
Basically people just doubling down on everything you just described. I can’t quite put a finger on it, but it has a tinge of insecurity or something like that; I hope that’s not the case and I’m just misinterpreting.
https://news.ycombinator.com/item?id=43504459
> And people are like, "Wah, it can't write code like a Senior engineer with 20 years of experience!"
But LLMs should be good enough to resolve this confusion, ask them!
... But I do not believe we're on the cusp of a Lawnmower-Man future where someone's Metaverse eats all the retail-conference-halls and movie-theaters and retail-stores across the entire globe in an unbridled orgy of mind-shattering investor returns.
Similarly, LLMs are neat and have some sane uses, but the fervor about how we're about to invent the Omnimind and usher in the singularity and take over the (economic) world? Nah.
What next, "This Internet thing was just a fad" or "The industrial age was a fad"?
As far as breaking our reality and society? Absolutely :(
Choose a very narrow domain that you know well, and you quickly realize they are just repeating the training data.
On the other hand, I saw github recently added Copilot as a code reviewer. For fun I let it review my latest pull request. I hated its suggestions but could imagine a not too distant future where I'm required by upper management to satisfy the LLM before I'm allowed to commit. Similarly, I've asked ChatGPT questions and it's been programmed to only give answers that Silicon Valley workers have declared "correct".
The thing I always find frustrating about the naysayers is that they seem to think how it works today is the end of it. Like I recently listened to an episode of EconTalk interviewing someone on AI and education. She lives in the UK and used Tesla FSD as an example of how bad AI is. Yet I live in California and see Waymo mostly working today and lots of people using it. I believe she wouldn't have used the Tesla FSD example, and would possibly have changed her world view at least a little, if she'd updated on seeing self-driving work.
Except this isn't true. The code quality varies dramatically depending on what you're doing, the length of the chat/context, etc. It's an incredible productivity booster, but even earlier today, I wasted time debugging hallucinated code because the LLM mixed up methods in a library.
The problem isn't so much that it's not an amazing technology, it's how it's being sold. The people who stand to benefit are speaking as though they've invented a god and are scaring the crap out of people making them think everyone will be techno-serfs in a few years. That's incredibly careless, especially when as a technical person, you understand how the underlying system works and know, definitively, that these things aren't "intelligent" the way they're being sold.
Like the startups of the 2010s, everyone is rushing, lying, and huffing hopium deluding themselves that we're minutes away from the singularity.
Thank goodness for that too. I want it to help me with my job, not replace me.
Both are Markov chains; that you erroneously thought a Markov chain was a way to make a chatbot, rather than a general mathematical process, is on you, not them.
Chatbots like in the sci-fi of your nostalgia? I never dreamed about that shit, sorry.
It was, more or less, the same narrative arc as Bitcoin, and was (is) headed for a crash.
That said, I've spent a few weeks with augment, and it is revelatory, certainly. All the marketing - aimed at a suite I have no interest in - managed to convince me it was something it wasn't. It isn't a replacement, any more than a power drill is a replacement for a carpenter.
What it is, is very helpful. "The world's most fully functioning scaffolding script", an upgrade from copilot's "the world's most fully functioning tab-completer". I appreciate it usefulness as a force multiplier, but I am already finding corners and places where I'd just prefer to do it myself. And this is before we get into the craft of it all - I am not excited by the pitch "worse code, faster", but the utility is undeniable in this capitalistic hell planet, and I'm not a huge fan of writing SQL queries anyway, so here we are!
Maybe Freud could explain.
It isn't ANY form of intelligence.
To quote Joel Spolsky, "When you’re working on a really, really good team with great programmers, everybody else’s code, frankly, is bug-infested garbage, and nobody else knows how to ship on time.", and that's the state we end up if we believe in the hype and use LLMs willy-nilly.
That's why people are annoyed: not because LLMs cannot code like a senior engineer, but because a lot of content marketing and company valuation is dependent on making people believe that's the case.
That's how I see LLMs and the hype surrounding them.
And people keep forgetting how new this stuff is.
This is like trashing video games in 1980 because Pong has awful graphics.
No, it provides responses. It does not talk.
I can ask Claude the most inane programming question and got an answer. If I were to do that on StackOverflow, I'd get downvoted, rude comments, and my question closed for being off-topic. I don't have to be super knowledgeable about the thing I'm asking about with Claude (or any LLM for that matter).
Even if you ignore the rudeness and elitism of power-users of certain platforms, there's no more waiting for someone to respond to your esoteric questions. Even if the LLM spews bullshit, you can ask it clarifying questions or rephrase until you see something that makes sense.
I love LLMs, I don't care what people say. Even when I'm just spitballing ideas[1], the output is great.
---
[1]: https://blog.webb.page/2025-03-27-spitball-with-claude.txt
It's incredibly frustrating when people think they're a miracle tool and blindly copy/paste output without doing any kind of verification. This is especially frustrating when someone who's supposed to be a professional in the field is doing it (copy-pasting non-working AI-generated code and putting it up for review).
On one hand, they multiply productive and useful information. On the other hand, they kill productivity and spread misinformation. That said, I still see them as useful, but not a miracle.
Truly amazing technology which is very good at generating and correcting texts is marketed as senior developer, talented artist, and black box that has solution to all your problems. This impression shatters on the first blatant mistake, e.g. counting elephant legs: https://news.ycombinator.com/item?id=38766512
https://youtu.be/aGnMbKwP36U?si=WbXzphhhP8Hak1OQ
It’s a human nature thing - we’re supposed to be collecting nuts in the forest.
But I will admit the dora muckbang feet shit is fucking insane. And that just flat out scares the pants off me.
Sorry but this is a total skill issue lol. 80% code failure rate is just total nonsense. I don't think 1% of the code I've gotten from LLMs has failed to execute correctly.
Almost every time I've tried using LLMs, I've fallen into the pattern of calling out, correcting, and arguing with them, which is of course silly in itself, because they don't learn; they don't really "get it" when they are wrong. There's none of the benefit of talking to a human.
Its also a slow burn issue - you have to use it for a while for what is obvious to users, to become obvious to people who are tech first.
The primary issue is the hype and forecasted capabilities vs actual use cases. People want something they can trust as much as an authority, not as much as a consultant.
If I were to put it in a single sentence? These are primarily narrative tools, being sold as factual /scientific tools.
When this is pointed out, the conversation often shifts to “well people aren’t that great either”. This takes us back to how these tools are positioned and sold. They are being touted as replacements to people in the future. When this claim is pressed, we get to the start of this conversation.
Frankly, people on HN aren’t pessimistic enough about what is coming down the pipe. I’ve started looking at how to work in 0 Truth scenarios, not even 0 trust. This is a view held by everyone I have spoken to in fraud, misinformation, online safety.
There’s a recent paper which showed that GAI tools improved the profitability of phishing attempts by something like 50x in some categories, and made previously loss-making (in $/hour terms) targets profitable. Schneier was one of the authors.
A few days ago I found out someone I know who works in finance, had been deepfaked and their voice/image used to hawk stock tips. People were coming to their office to sue them.
I love tech, but this is the dystopia part of cyberpunk being built. These are narrative tools, good enough to make people think they are experts..
If you ask it random things the output looks amazing, yes. At least at first glance. That's what they do. It's indeed magical, a true marvel that should make you go: Woooow, this is amazing tech: Coming across as convincing, even if based on hallucinations, is in itself a neat trick!
But is it actually useful? The things they come up with are untrustworthy and on the whole far less good than previously available systems. In many ways, insidiously worse: It's much harder to identify bad information than it was before.
It's almost like we designed a system to pass Turing tests with flying colours but forgot that usefulness is what we actually wanted, not authoritative, human-sounding bullshit.
I don't think the LLM naysayers are 'unimpressed', or that they demand perfection. I think they are trying to make statements aimed at balancing things:
Both the LLMs themselves, and the humans parroting the hype, are severely overstating the quality of what such systems produce. Hence, and this is a natural phenomenon you can observe in all walks of life, the more skeptical folks tend to swing the pendulum the other way, and thus it may come across to you as them being overly skeptical instead.
I'm trans, and I don't disagree that this technology has aspects that are problematic. But for me at least, LLMs have been a massive equalizer in the context of a highly contentious divorce where the reality is that my lawyer will not move a finger to defend me. And he's lawyer #5 - the others were some combination of worse, less empathetic, and more expensive. I have to follow up a query several times to get a minimally helpful answer - it feels like constant friction.
ChatGPT was a total game-changer for me. I told it my ex was using our children to create pressure - feeding it snippets of chat transcripts. ChatGPT suggested this might be indicative of coercive control abuse. It sounded very relevant (in a rare, candid moment, my ex even once admitted that she feels a need to control everyone around her), so I googled the term - essentially all the components were there except physical violence (with two notable exceptions).
Once I figured that out, I asked it to tell me about laws related to controlling relationships - and it suggested laws either directly addressing (in the UK and Australia), and the closest laws in Germany (Nötigung, Nachstellung, violations of dignity, etc., translating them to English - my best language). Once you name specific laws broken and provide a rationale for why there's a Tatbestand (ie the criterion for a violation is fulfilled), your lawyer has no option but to take you more seriously. Otherwise he could face a malpractice suit.
Sadly, even after naming specific law violations and pointing to email and chat evidence, my lawyer persists in dragging his feet - so much so that the last legal letter he sent wasn't drafted by him - it was ChatGPT. I told my lawyer: read, correct, and send to X. All he did was to delete a paragraph and alter one or two words. And the letter worked.
Without ChatGPT, I would be even more helpless and screwed than I am. It's far from clear I will get justice in a German court, but at least ChatGPT gives me hope, a legal strategy. Lastly - and this is a godsend for a victim of coercive control - it doesn't degrade you. Lawyers do. It completely changed the dynamics of my divorce (4 years - still no end in sight, lost my custody rights, then visitation rights, was subjected to confrontational and gaslighting tactics by around a dozen social workers - my ex is a social worker -, and then I literally lost my hair: telogen effluvium, tinea capitis, alopecia areata... if it's stress-related, I've had it), it gave me confidence when confronting my father and brother about their family violence.
It's been the ONLY reliable help, frankly, so much so I'm crying as I write this. For minorities that face discrimination, ChatGPT is literally a lifeline - and that's more true the more vulnerable you are.
WhY aRe PeOpLe BuLlIsH
LLMs produce midwit answers. If you are an expert in your domain, the results are kind of what you would expect for someone who isn’t an expert. That is occasionally useful but if I wanted a mediocre solution in software I’d use the average library. No LLM I have ever used has delivered an expert answer in software. And that is where all the value is.
I worked in AI for a long time, I like the idea. But LLMs are seemingly incapable of replacing anything of value currently.
The elephant in the room is that there is no training data for the valuable skills. If you have to rely on training data to be useful, LLMs will be of limited use.
If this were true, no one would hire junior employees and assistants. There's a huge amount of work that requires more time than expertise.
When an AI can say “Here’s how you make better, smaller, more powerful batteries, follow these plans”, then we will have a reason to worship AI.
When AI can bring us wonders like room-temperature superconductors, fast interstellar travel, anti-gravity tech, and solutions to world hunger and energy consumption, then it will have fulfilled the promise of what AI could do for humanity.
Until then, LLMs are just fancy search and natural language processors. Puppets with strings. It’s about as impressive as Google was when it first came out.
I think that there are two kinds of people who use AI: people who are looking for the ways in which AIs fail (of which there are still many) and people who are looking for the ways in which AIs succeed (of which there are also many).
A lot of what I do is relatively simple one off scripting. Code that doesn't need to deal with edge cases, won't be widely deployed, and whose outputs are very quickly and easily verifiable.
LLMs are almost perfect for this. It's generally faster than me looking up syntax/documentation, when it's wrong it's easy to tell and correct.
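A made-up but representative example of that category: a throwaway script whose output you can sanity-check at a glance, so a wrong answer costs you seconds, not hours:

```python
from collections import Counter
from pathlib import Path

def extension_counts(root: str) -> Counter:
    """Tally file extensions under a directory tree.

    Typical one-off scripting: if the numbers look off,
    you notice immediately and just fix or re-ask.
    """
    return Counter(
        p.suffix or "<none>"
        for p in Path(root).rglob("*")
        if p.is_file()
    )
```

That risk profile (cheap to verify, cheap to discard) is what makes delegation to an LLM comfortable here.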
Look for the ways that AI works, and it can be a powerful tool. Try and figure out where it still fails, and you will see nothing but hype and hot air. Not every use case is like this, but there are many.
-edit- Also, when she says "none of my students has ever invented references that just don't exist"...all I can say is "press X to doubt"
The problem is that I feel I am constantly being bombarded by people bullish on AI saying "look how great this is" but when I try to do the exact same things they are doing, it doesn't work very well for me
Of course I am skeptical of positive claims as a result.
Annoying response of course. But I’d never used an LLM to debug before, so I figured I’d give it a try.
First: it regurgitated a bunch of documentation and basic debugging tips, which might have actually been helpful if I had just encountered this problem and had put no thought into debugging it yet. In reality, I had already spent hours on the problem. So not helpful
Second: I provided some further info on environment variables I thought might be the problem. It latched on to that. “Yes that’s your problem! These environment variables are (causing the problem) because (reasons that don’t make sense). Delete them and that should fix things.” I deleted them. It changed nothing.
Third: It hallucinated a magic numpy function that would solve my problem. I informed it this function did not exist, and it wrote me a flowery apology.
Clearly AI coding works great for some people, but this was purely an infuriating distraction. Not only did it not solve my problem, it wasted my time and energy, and threw tons of useless and irrelevant information at me. Bad experience.
I see people say, "Look how great this is," and show me an example, and the example they show me is just not great. We're literally looking at the same thing, and they're excited that this LLM can do a college grad's job to the level of a third grader, and I'm just not excited about that.
Treat the AI as a freelancer working on your project. How would you ask a freelancer to create a Kanban system for you? By simply asking "Create a Kanban system", or by providing them a 2-3 pages document describing features, guidelines, restrictions, requirements, dependencies, design ethos, etc?
Which approach will get you closer to your objective?
The same applies to LLMs (when it comes to code generation). When well instructed, they can quickly generate a lot of working code and apply the necessary fixes/changes you request inside that same context window.
It still can't generate senior-level code, but it saves hours when doing grunt work or prototyping ideas.
"Oh, but the code isn't perfect".
Nor is the code of the average jr dev, but their code still makes it to production in thousands of companies around the world.
About 2 weeks ago I started on a streaming markdown parser for the terminal because none really existed. I've switched to human coding now, but the first version was basically all LLM prompting, and a bunch of the code is still LLM-generated (maybe 80%). It's a parser, and those are hard. There are stacks, states, lookaheads, look-behinds, feature flags, color spaces, support for things like links and syntax highlighting... all forward-streaming. Not easy.
https://github.com/kristopolous/Streamdown
Exactly this.
I once had a function that would generate several .csv reports. I wanted these reports to then be uploaded to s3://my_bucket/reports/{timestamp}/*.csv.
I asked ChatGPT: "Write a function that moves all .csv files in the current directory to an old_reports directory, calls a create_reports function, then uploads all the .csv files in the current directory to s3://my_bucket/reports/{timestamp}/*.csv, with the timestamp in YYYY-MM-DD format." And it created the code perfectly. I knew what the correct code would look like; I just couldn't be fucked to look up the exact calls to boto3, whether moving files was os.move or os.rename or something from shutil, and the exact way to format a datetime object.
It created the code far faster than I would have.
Like, I certainly wouldn't use it to write a whole app, or even a whole class, but individual blocks like this, it's great.
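For what it's worth, the block described above is small enough to sketch. Assuming the placeholder names from the comment (create_reports, my_bucket), the answer the LLM saves you from looking up is roughly:

```python
import glob
import os
import shutil
from datetime import datetime


def report_prefix(now=None):
    """S3 key prefix with the timestamp in YYYY-MM-DD format."""
    now = now or datetime.now()
    return f"reports/{now.strftime('%Y-%m-%d')}/"


def rotate_and_upload(bucket="my_bucket", create_reports=lambda: None):
    """Move existing .csv files aside, regenerate them, then upload.

    create_reports is assumed to write fresh .csv files to the cwd;
    the bucket name is the placeholder from the comment above.
    """
    import boto3  # third-party; the upload also needs AWS credentials

    os.makedirs("old_reports", exist_ok=True)
    for path in glob.glob("*.csv"):
        # shutil.move (not os.rename) also works across filesystems
        shutil.move(path, os.path.join("old_reports", path))

    create_reports()

    s3 = boto3.client("s3")
    for path in glob.glob("*.csv"):
        s3.upload_file(path, bucket, report_prefix() + path)
```

Ten minutes of doc-diving compressed into one prompt, which is exactly the point being made.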
I use it to produce whole classes, large sql queries, terraform scripts, etc etc. I then look over that output, iterate on it, adjust it to my needs. It's never exactly right at first, but that's fine - neither is code I write from scratch. It's still a massive time saver.
It doesn't just save me a ton of time, it results in me building automations that I normally wouldn't have taken on at all because the time spent fiddling with os.move/boto3/etc wouldn't have been worthwhile compared to other things on my plate.
But if you can do the task well enough to at least recognize likely-to-be-correct output, then you can get a lot done in less time than you would do it without their assistance.
Is that worth the second order effects we're seeing? I'm not convinced, but it's definitely changed the way we do work.
As you said, examples where I wouldn't expect LLMs to be good at from people who dismiss the scenarios where LLMs are great at. I don't want to convince anyone, to be honest - I just want to say they are incredibly useful for me and a huge time saver. If people don't want to use LLMs, it's fine for me as I'll have an edge over them in the market. Thanks for the cash, I guess.
I'm growing weary of trying to help people use these tools properly.
Automating the easy 80% sounds useful, but in practice I'm not convinced that's all that helpful. Reading and putting together code you didn't write is hard enough to begin with.
I’ve never seen it from my students. Why do you think this? It’s trivial to pick a real book/article. No student is generating fake material whole cloth and fake references to match. Even if they could, why would they risk it?
I know arguments from authority aren't primary, but I think this point highlights some important context: Dr. Hossenfelder has gained international renown by publishing clickbait-y YouTube videos that ostensibly debunk scientific and technological advances of all kinds. She's clearly educated and thoughtful (not to mention otherwise gainfully employed), but her whole public persona kinda relies on assuming the exclusively-critical standpoint you mention.
I doubt she necessarily feels indebted to her large audience expecting this take (it's not new...), but that certainly does seem like a hard cognitive habit to break.
"Garbage in, garbage out" as the law says.
Of course, it took a lot of trial and error for me to get to my current level of effectiveness with LLMs. It's probably our responsibility to teach these who are willing.
Those people, likely, will never change their opinion.
And that’s fine, because they won’t get the huge benefits that come from spending time learning how to use the tool properly.
Every once in a while I send a query off to ChatGPT and I'm often disappointed and jam on the "this was hallucinated" feedback button (or whatever it is called). I have better luck with Claude's chat interface but nowhere near the quality of response that I get with Cline driving.
What I am seeing is fanboys who offer me examples of things working well that fail any close scrutiny— with the occasional example that comes out actually working well.
I agree that for prototyping unimportant code LLMs do work well. I definitely get to unimportant point B from point A much more quickly when trying to write something unfamiliar.
And this doesn’t include lying and cheating, which LLMs can’t do.
On the other hand, AI is used to solve problems that are already solved. I recently got an ad for process-modeling software claiming that you don't always need to start from the ground up; you can ask the AI to, say, give you a customer-order process and start from that point. That is basically what templates are for, with much less energy consumption.
You hit the nail on the head with this one. Around me, I've noticed that the bashing of LLMs comes from smart people who want others to know they are smart.
It doesn't always correlate with narcissism, but it happens much more than chance.
Yes, somewhat. It's good for PowerShell/bash/cmd scripts and configs, but early models would hallucinate PowerShell cmdlets especially.
The use cases are vastly different and the first is just _not_ world changing. It’s great, don’t get me wrong, but it won’t change the world.
"I write code all day with LLMs, it's amazing!" is in the exact same category. The code you (general you, I'm not picking on you in particular) write using LLMs, and the code I write apart from LLMs: they are not the same. They are categorically different artifacts.
"I ask them to give me a source for an alleged quote, I click on the link, it returns a 404 error. I Google for the alleged quote, it doesn't exist. They reference a scientific publication, I look it up, it doesn't exist."
To experienced LLM users that's not surprising at all - providing citations, sources for quotes, useful URLs are all things that they are demonstrably terrible at.
But it's a computer! Telling people "this advanced computer system cannot reliably look up facts" goes against everything computers have been good at for the last 40+ years.
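The 404 problem in particular is mechanically checkable. A minimal sketch (my own, not something any vendor is known to ship) that verifies an emitted URL before showing it to the user:

```python
from urllib.request import Request, urlopen


def link_resolves(url, timeout=5):
    """Return True only if the URL answers with a non-error HTTP status."""
    try:
        req = Request(url, method="HEAD", headers={"User-Agent": "link-check"})
        with urlopen(req, timeout=timeout) as resp:
            return resp.status < 400
    # URLError/HTTPError are OSError subclasses; malformed URLs raise ValueError
    except (ValueError, OSError):
        return False
```

It catches dead links but not plausible-looking links to the wrong paper, which is the harder half of the problem.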
And that’s honestly unfair to you since you do awesome realistic and level headed work with LLM.
But I think it’s important when having discussions to understand the context within which they are occurring.
Without the bulls she might very well be saying what you are in your last paragraph. But because of the bulls the conversation becomes this insane stratified nonsense.
Use google AI studio with search grounding. Provides correct links and citations every time. Other companies have similar search modes, but you have to enable those settings if you want good results.
Of late, deaf tech forums have been taken over by language-model debates over which model works best for speech transcription. (Multimodal language models are the state of the art in machine transcription. Everyone seems to forget that when complaining they can't cite sources for scientific papers yet.) The debates have gotten to the point that it's become annoying how much space they've taken over, just like here on HN.
But then I remember, oh yeah, there was no such thing as live machine transcription ten years ago. And now there is. And it's going to continue to get better. It's already good enough to be very useful in many situations. I have elsewhere complained about the faults of AI models for machine transcription - in particular when they make mistakes they tend to hallucinate something that is superficially grammatical and coherent instead - but for a single phrase in an audio transcription sporadically that's sometimes tolerable. In many cases you still want a human transcriber but the cost of that means that the amount of transcription needed can never be satisfied.
It's a revolutionary technology. I think in a few years I'm going have glasses that continuously narrate the sounds around me and transcribe speech and it's going to be so good I can probably "pass" as a hearing person in some contexts. It's hard not to get a bit giddy and carried away sometimes.
If everyone is using them wrong, I would argue that says something more about them than the users. Chat-based interfaces are the thing that kicked LLMs into the mainstream consciousness and started the cycle/trajectory we’re on now. If this is the wrong use case, everything the author said is still true.
There are still applications made better by LLMs, but they are a far cry from AGI/ASI in terms of being all-knowing problem solvers that don’t make mistakes. Language tasks like transcription and translation are valuable, but by no stretch do they account for the billions of dollars of spend on these platforms, I would argue.
Yes the costs of training AI models these days are really high too, but now we're just making a quantitative argument, not a qualitative one.
The fact that we've discovered a near-magical tech that everyone wants to experiment with in various contexts, is evidence that the tech is probably going somewhere.
Historically speaking, I don't think any scientific invention or technology has been adopted and experimented with so quickly and on such a massive scale as LLMs.
It's crazy that people like you dismiss the tech simply because people want to experiment with it. It's like some of you are against scientific experimentation for some reason.
What? Then what the hell do you call Dragon NaturallySpeaking and other similar software in that niche?
I have a minor speech impediment because of the hearing loss. They never worked for me very well. I don't speak like a standard American - I have a regional accent and I have a speech impediment. Modern speech recognition doesn't seem to have a problem with that anymore.
IBM's ViaVoice from 1997 in particular was a major step. It was really impressive in a lot of ways but the accuracy rate was like 90 - 95% which in practice means editing major errors with almost every sentence. And that was for people who could speak clearly. It never worked for me very well.
You also needed to speak in an unnatural way [pause] comma [pause] and it would not be fair to say that it transcribed truly natural speech [pause] full stop
Such voice recognition systems before about 2016 also required training on the specific speaker. You would read many pages of text to the recognition engine to tune it to you specifically.
It could not just be pointed at the soundtrack to an old 1980s TV show then produce a time-sync'd set of captions accurate enough to enjoy the show. But that can be done now.
It’s been a common mantra - at least in my bubble of technologists - that a good majority of the software engineering skill set is knowing how to search well. Knowing when search is the right tool, how to format a query, how to peruse the results and find the useful ones, what results indicate a bad query you should adjust… these all sort of become second nature the longer you’ve been using Search, but I also have noticed them as an obvious difference between people that are tech-adept vs not.
LLMs seem to have a very similar usability pattern. They’re not always the right tool, and are crippled by bad prompting. Even with good prompting, you need to know how to notice good results vs bad, how to cherry-pick and refine the useful bits, and have a sense for when to start over with a fresh prompt. And none of this is really _hard_; just like Search, none of us need to go take a course on prompting. IMO folks just need to engage with LLMs as a non-perfect tool they are learning how to wield.
The fact that we have to learn a tool doesn’t make it a bad one. The fact that a tool doesn’t always get it 100% on the first try doesn’t make it useless. I strip a lot of screws with my screwdriver, but I don’t blame the screwdriver.
On a side note, this lady is a fraud: https://www.youtube.com/watch?v=nJjPH3TQif0&themeRefresh=1
In no way am I credentialing her; lots of people can make astute observations about things they weren't trained in. But she has both mastered sounding authoritative and, at the same time, presents things to get the most engagement possible.
If you don't have that experience in this domain, you will spend approximately as much effort validating output as you would have creating it yourself, but the process is less demanding of your critical skills.
Since reasoning models came about, I've been significantly more bullish on them, purely because they are less bad. They are still not amazing, but they are at a point where I feel like including them in my workflow isn't an impediment.
They can now reliably complete a subset of tasks without me needing to rewrite large chunks of it myself.
They are still pretty terrible at edge cases (uncommon patterns, libraries, etc.), but when on the beaten path they can actually pretty decently improve productivity. I still don't think it's 10x (well, today was the first time I felt a 10x improvement, but I was moving frontend code from a custom framework to React, more tedium than anything else, and the AI did a spectacular job).
These critics don't seem to have learned the lesson that the perfect is the enemy of the good.
I use ChatGPT all the time for academic research. Does it fabricate references? Absolutely, maybe about a third of the time. But has it pointed me to important research papers I might never have found otherwise? Absolutely.
The rate of inaccuracies and falsehoods doesn't matter. What matters is whether it saves you time and increases your productivity. Verifying the accuracy of its statements is easy, while finding the knowledge it spits out in the first place is hard. The net balance is a huge positive.
People are bullish on LLMs because they can save you days' worth of work, like every day. My research productivity has gone way up with ChatGPT -- asking it to explain ideas, related concepts, relevant papers, and so forth. It's amazing.
For single statements, sometimes, but not always. For all of the many statements, no. Having the human attention and discipline to mindfully verify every single one without fail? Impossible.
Every software product/process that assumes the user has superhuman vigilance is doomed to fail badly.
> Automation centaurs are great: they relieve humans of drudgework and let them focus on the creative and satisfying parts of their jobs. That's how AI-assisted coding is pitched [...]
> But a hallucinating AI is a terrible co-pilot. It's just good enough to get the job done much of the time, but it also sneakily inserts booby-traps that are statistically guaranteed to look as plausible as the good code (that's what a next-word-guessing program does: guesses the statistically most likely word).
> This turns AI-"assisted" coders into reverse centaurs. The AI can churn out code at superhuman speed, and you, the human in the loop, must maintain perfect vigilance and attention as you review that code, spotting the cleverly disguised hooks for malicious code that the AI can't be prevented from inserting into its code. As qntm writes, "code review [is] difficult relative to writing new code":
-- https://pluralistic.net/2025/03/18/asbestos-in-the-walls/
I mean, how do you live life?
The people you talk to in your life say factually wrong things all the time.
How do you deal with it?
With common sense, a decent bullshit detector, and a healthy level of skepticism.
LLMs aren't calculators. You're not supposed to rely on them to give perfect answers. That would be crazy.
And I don't need to verify "every single statement". I just need to verify whichever part I need to use for something else. I can run the code it produces to see if it works. I can look up the reference to see if it exists. I can Google the particular fact to see if it's real. It's really very little effort. And the verification is orders of magnitude easier and faster than coming up with the information in the first place. Which is what makes LLMs so incredibly helpful.
And you don't have concerns about that? What kind of damage is that doing to our society, long term, if we have a system that _everyone_ uses and it's just accepted that a third of the time it is just making shit up?
Like, I can ask a friend and they'll mistakenly make up a reference. "Yeah, didn't so-and-so write a paper on that? Oh they didn't? Oh never mind, I must have been thinking of something else." Does that mean I should never ask my friend about anything ever again?
Nobody should be using these as sources of infallible truth. That's a bonkers attitude. We should be using them as insanely knowledgeable tutors who are sometimes wrong. Ask and then verify.
The net benefit is huge.
Main problem with our society is that two thirds of what _everyone_ says is made up shit / motivated reasoning. The random errors LLMs make are relatively benign, because there is no motivation behind them. They are just noise. Look through them.
Could it end up being a net benefit? Will the realistic-sounding but incorrect facts generated by AI make people engage with arguments more critically, and be less likely to believe random statements they're given?
Now, I don't know, or even think it is likely that this will happen, but I find it an interesting thought experiment.
LLMs will spit out responses with zero backing with 100% conviction. People see citations and assume it's correct. We're conditioned for it thanks to....everything ever in history. Rarely do I need to check a wikipedia entry's source.
So why do people not understand that this is absolutely going to pour jet fuel on misinformation in the world? And we as a society are allowed to hold a higher bar for what we'll accept being shoved down our throats by corporate overlords who want their VC payout.
Imagine there is a probabilistic oracle that can answer any yes/no question with success probability p. If p=100% or p=0%, then it is obviously very useful. If p=50%, then it is absolutely worthless. In other cases, such an oracle can be utilized in different ways to get the answer we want, and it is still a useful thing.
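The standard trick for "other cases" is repetition: any p above 50% can be amplified toward certainty by majority-voting over independent queries (and for p below 50%, you invert the answers first). A quick sketch:

```python
import random
from collections import Counter


def boosted_oracle(oracle, k):
    """Majority vote over k independent yes/no queries.

    Any accuracy p > 0.5 is amplified toward certainty as k grows;
    for p < 0.5, invert the oracle's answers first. Odd k avoids ties.
    """
    votes = Counter(oracle() for _ in range(k))
    return votes.most_common(1)[0][0]


# A toy oracle that is right 70% of the time (True is the correct answer).
random.seed(0)
print(boosted_oracle(lambda: random.random() < 0.7, 101))
```

With a 70%-accurate oracle and 101 votes, the majority is wrong only with vanishing probability. The catch with LLMs, of course, is that repeated queries are not independent in the way this model assumes.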
Unreliability is something we live in. It is the world. Controlling error, increasing signal over noise, extracting energy from the fluctuations. This is life, man. This is what we are.
I can use LLMs very effectively. I can use search engines very effectively. I can use computers.
Many others can’t. Imagine the sheer fortune to be born in the era where I was meant to be: tools transformative and powerful in my hands; useless in others’.
I must be blessed by God.
Its true success rate is by no means 100%, and sometimes is 0%, but it always tries to make you feel confident.
I’ve had to catch myself surrendering too much judgment to it. I worry a high school kid learning to write will have fewer qualms surrendering judgment
So we're trying to use tools like this to help solve deeper problems, and they aren't up to the task. We're still at the point where we need to start over and get better tools. Sharpening a bronze knife will never make it as sharp, or hold an edge as long, as a steel knife. Same basic elements, very different material.
Their usefulness comes down entirely to your ability both to find what you need without them and to verify the information they give you. If you put that on a matrix, this makes them useful in the quadrant of information that is hard to find but easy to verify. Which, at least in my daily work, is a reasonable amount.
There’s no question that we’re in a bubble which will eventually subside, probably in a “dot com” bust kind of way.
But let me tell you…last month I sent several hundred million requests to AI, as a single developer, and got exactly what I needed.
Three things are happening at once in this industry… (1) executives are over promising a literal unicorn with AGI, that is totally unnecessary for the ongoing viability of LLM’s and is pumping the bubble. (2) the technology is improving and delivery costs are changing as we figure out what works and who will pay. (3) the industry’s instincts are developing, so it’s common for people to think “AI” can do something it absolutely cannot do today.
But again…as one guy, for a few thousand dollars, I sent hundreds of millions of requests to AI that are generating a lot of value for me and my team.
Our instincts have a long way to go before we’ve collectively internalized the fact that one person can do that.
There are 2.6 million seconds in a month. You are claiming to have sent hundreds of requests per second to AI.
It is trivial for a server to send/receive 150 requests per second to the API.
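The back-of-envelope arithmetic is checkable in a couple of lines (taking "several hundred million" as 300 million, an assumption):

```python
seconds_per_month = 30 * 24 * 3600   # 2,592,000, i.e. roughly 2.6 million
requests = 300_000_000               # "several hundred million", assumed
print(round(requests / seconds_per_month))  # about 116 sustained requests/second
```

Well within what one batch-processing job against an API can sustain.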
This is what I mean by instincts...we're used to thinking of developers-pressing-keys as a fundamental bottleneck, and it still is to a point. But as soon as the tracks are laid for the AI to "work", things go from speed-of-human-thought to speed-of-light.
If you have a lot of GPU's and you're doing massive text processing like spam detection for hundreds of thousands of users, sure.
But "as a single developer", "value for me and my team"... I'm confused.