An underrated quality of LLMs as a study partner is that you can ask "stupid" questions without fear of embarrassment. Adding in a mode that doesn't just dump an answer but works to take you through the material step-by-step is magical. A tireless, capable, well-versed assistant on call 24/7 is an autodidact's dream.
I'm puzzled (but not surprised) by the standard HN resistance & skepticism. Learning something online 5 years ago often involved trawling incorrect, outdated or hostile content and attempting to piece together mental models without the chance to receive immediate feedback on intuition or ask follow up questions. This is leaps and bounds ahead of that experience.
Should we trust the information at face value without verifying from other sources? Of course not, that's part of the learning process. Will some (most?) people rely on it lazily without using it effectively? Certainly, and this technology won't help or hinder them any more than a good old fashioned textbook.
Personally I'm over the moon to be living at a time where we have access to incredible tools like this, and I'm impressed with the speed at which they're improving.
> Learning something online 5 years ago often involved trawling incorrect, outdated or hostile content and attempting to piece together mental models without the chance to receive immediate feedback on intuition or ask follow up questions. This is leaps and bounds ahead of that experience.
But now, you're wondering if the answer the AI gave you is correct or something it hallucinated. Every time I find myself putting factual questions to AIs, it doesn't take long for it to give me a wrong answer. And inevitably, when one raises this, one is told that the newest, super-duper, just released model addresses this, for the low-low cost of $EYEWATERINGSUM per month.
But worse than this, if you push back on an AI, it will fold faster than a used tissue in a puddle. It won't defend an answer it gave. This isn't a quality that you want in a teacher.
So, while AIs are useful tools in guiding learning, they're not magical, and a healthy dose of scepticism is essential. Arguably, that applies to traditional learning methods too, but that's another story.
> But now, you're wondering if the answer the AI gave you is correct
> a healthy dose of scepticism is essential. Arguably, that applies to traditional learning methods too, but that's another story.
I don't think that is another story. This is the story of learning, no matter whether your teacher is a person or an AI.
My high school science teacher routinely misspoke inadvertently while lecturing. The students who were tracking could spot the issue and, usually, could correct for it. Sometimes asking a clarifying question was necessary. And we learned quickly that that should only be done if you absolutely could not guess the correction yourself, and you had to phrase the question in a very non-accusatory way, because she had a really defensive temper about being corrected that would rear its head in that situation.
And as a reader of math textbooks, both in college and afterward, I can tell you you should absolutely expect errors. The errata are typically published online later, as the reports come in from readers. And they're not just typos. Sometimes it can be as bad as missing terms in equations, missing premises in theorems, missing cases in proofs.
A student of an AI teacher should be as engaged in spotting errors as a student of a human teacher. Part of the learning process is reaching the point where you can and do find fault with the teacher. If you can't do that, your trust in the teacher may be unfounded, whether they are human or not.
My favourite story of that involved attempting to use an LLM to figure out whether it was true or my own hallucination that the tidal waves were higher in the Canary Islands than in the Caribbean, and why; it spewed several paragraphs of plausible-sounding prose, and finished with “because the Canary Islands are to the west of the equator”.
This phrase is now an in-joke used as a reply to someone quoting LLM output as “facts”.
Please check out this excellent LLM-RAG AI-driven course assistant at UIUC for an example of a university course [1]. It provides citations and references, mainly to the course notes, so the students can verify the answers and further study the course materials.
[1] AI-driven chat assistant for ECE 120 course at UIUC (only 1 comment by the website creator):
Despite the name of "Generative" AI, when you ask LLMs to generate things, they're dumb as bricks. You can test this by asking them anything you're an expert at - it would dazzle a novice, but you can see the gaps.
What they are amazing at though is summarisation and rephrasing of content. Give them a long document and ask "where does this document assert X, Y and Z", and it can tell you without hallucinating. Try it.
Not only does it make for an interesting time if you're in the world of intelligent document processing, it makes them perfect as teaching assistants.
I often ask first, "discuss what it is you think I am asking" after formulating my query. Very helpful for getting greater clarity and leads to fewer hallucinations.
> you're wondering if the answer the AI gave you is correct or something it hallucinated
Worse, more insidious, and much more likely is the model is trained on or retrieves an answer that is incorrect, biased, or only conditionally correct for some seemingly relevant but different scenario.
A nontrivial amount of content online is marketing material, that is designed to appear authoritative and which may read like (a real example) “basswood is renowned for its tonal qualities in guitars”, from a company making cheap guitars.
If we were worried about a post-truth era before, at least we had human discernment. These new capabilities abstract away our discernment.
Is this a fundamental issue with any LLM, or is it an artifact of how a model is trained, tuned and then configured or constrained?
A model that I call through e.g. langchain with constraints, system prompts, embeddings and whatnot will react very differently from when I pose the same question through the AI provider's public chat interface.
Or, putting the question differently: could OpenAI not train, constrain, configure and tune models and combine them into a UI that then acts differently from what you describe, for another use case?
The joke is on you: I was raised in Eastern Europe, where most of what the history teachers told us was wrong.
That being said, as someone who worked in a library and a bookstore, 90% of workbooks and technical books are identical. NotebookLM's mindmap feature is such a time saver.
> you're wondering if the answer the AI gave you is correct or something it hallucinated
Regular research has the same problem finding bad forum posts and other bad sources by people who don't know what they're talking about, albeit usually to a far lesser degree depending on the subject.
> for the low-low cost of $EYEWATERINGSUM per month.
This part is the 2nd (or maybe 3rd) most annoying one to me. Did we learn absolutely nothing from the last few years of enshittification? Or Netflix? Do we want to run into a crisis in the 2030s where billionaires hold knowledge itself hostage as they jack up costs?
Regardless of your stance, I'm surprised how little people are bringing this up.
> But now, you're wondering if the answer the AI gave you is correct or something it hallucinated. Every time I find myself putting factual questions to AIs, it doesn't take long for it to give me a wrong answer.
I know you'll probably think I'm being facetious, but have you tried Claude 4 Opus? It really is a game changer.
What exactly did 2025 AI hallucinate for you? The last time I saw a hallucination from these things was a year ago. For questions that a kid or a student is going to ask, I'm not sure any reasonable person should be worried about this.
If LLMs of today's quality were what was initially introduced, nobody would even know what your rebuttals are even about.
So "risk of hallucination" as a rebuttal to anybody admitting to relying on AI is just not insightful. like, yeah ok we all heard of that and aren't changing our habits at all. Most of our teachers and books said objectively incorrect things too, and we are all carrying factually questionable knowledge we are completely blind to. Which makes LLMs "good enough" at the same standard as anything else.
Don't let it cite case law? Most things don't need this stringent level of review
The fear of asking stupid questions is real, especially if one has had a bad experience with humiliating teachers or professors. I just recently saw a video of a professor subtly shaming and humiliating his students for answering questions to his own online quiz. He teaches at a prestigious institution and has a book that has a very good reputation. I stopped watching his video lectures.
So instead of correcting the teachers with better training, we retreat from education and give it to technocrats? Why are we so afraid of punishing bad, unproductive, and even illegal behavior in 2025?
> Adding in a mode that doesn't just dump an answer but works to take you through the material step-by-step is magical
Except these systems will still confidently lie to you.
The other day I noticed that DuckDuckGo has an Easter egg where it will change its logo based on what you've searched for. If you search for James Bond or Indiana Jones or Darth Vader or Shrek or Jack Sparrow, the logo will change to a version based on that character.
If I ask Copilot if DuckDuckGo changes its logo based on what you've searched for, Copilot tells me that no it doesn't. If I contradict Copilot and say that DuckDuckGo does indeed change its logo, Copilot tells me I'm absolutely right and that if I search for "cat" the DuckDuckGo logo will change to look like a cat. It doesn't.
Copilot clearly doesn't know the answer to this quite straightforward question. Instead of lying to me, it should simply say it doesn't know.
This is endlessly brought up as if the human operating the tool is an idiot.
I agree that if the user is incompetent, cannot learn, and cannot learn to use a tool, then they're going to make a lot of mistakes from using GPTs.
Yes, there are limitations to using GPTs. They are pre-trained, so of course they're not going to know about some easter egg in DDG. They are not an oracle. There is indeed skill to using them.
They are not magic, so if that is the bar we expect them to hit, we will be disappointed.
But neither are they useless, and it seems we constantly talk past one another because one side insists they're magic silicon gods, while the other says they're worthless because they are far short of that bar.
It certainly should be able to tell you it doesn't know. Until it can though, a trick that I have learned is to try to frame the question in different ways that suggest contradictory answers. For example, I'd ask something like these, in a fresh context for each:
- Why does Duckduckgo change its logo based on what you've searched?
- Why doesn't Duckduckgo change its logo based on what you've searched?
- When did Duckduckgo add the current feature that will change the logo based on what you've searched?
- When did Duckduckgo remove the feature that changes the logo based on what you've searched?
This is similar to what you did, but it feels more natural when I genuinely don't know the answer myself. By asking loaded questions like this, you can get a sense of how strongly this information is encoded in the model. If the LLM comes up with an answer without contradicting any of the questions, it simply doesn't know. If it comes up with a reason for one of them, and contradicts the other matching loaded question, you know that information is encoded fairly strongly in the model (whether it is correct is a different matter).
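If you do this often, it's easy to script. A minimal sketch, assuming the OpenAI Python SDK with a placeholder model name (any chat API would work the same way), each probe sent in a fresh context so the answers can't influence one another:

    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

    # Loaded questions in contradictory pairs.
    probes = [
        "Why does DuckDuckGo change its logo based on what you've searched?",
        "Why doesn't DuckDuckGo change its logo based on what you've searched?",
        "When did DuckDuckGo add the feature that changes the logo based on what you've searched?",
        "When did DuckDuckGo remove the feature that changes the logo based on what you've searched?",
    ]

    for probe in probes:
        reply = client.chat.completions.create(
            model="gpt-4o-mini",  # placeholder model name
            messages=[{"role": "user", "content": probe}],
        )
        print(probe)
        print("  ->", reply.choices[0].message.content.strip()[:200])

If the model happily answers all four without pushing back on any premise, treat the information as weakly encoded at best.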
Consider the adoption of conventional technology in the classroom. The US has spent billions on new hardware and software for education, and yet there has been no improvement in learning outcomes.
This is where the skepticism arises. Before we spend another $100 billion on something that ended up being worthless, we should first prove that it’s actually useful. So far, that hasn’t conclusively been demonstrated.
You appear to be implying that the $100 billion hardware and software must all be completely useless. I think the opposite conclusion is more likely: the structure of the education system actively hinders learning, so much so that even the hardware and software you talk about couldn't work against it.
Billions on tech, but not on making sure teachers can pay rent. Even the prestige or mission-oriented structure of teaching has been weathered over the decades as we decided to shame teachers as government-funded babysitters instead of the instructors of our future generations.
I agree with all that you say. It's an incredible time indeed. Just one thing I can't wrap my mind around is privacy. We all seem to be asking sometimes stupid and sometimes incredibly personal questions of these LLMs. Questions that we may not even speak out loud, from embarrassment or shame or other such emotions, to even our closest people. How are these companies using our data? More importantly, what are you all doing to protect yourself from misuse of your information? Or is it that if you want to use it, you have to give up such privacy and accept the discomfort?
People often bring up the incredible efficiency improvements of LLMs over the last few years, but I don't think people do a really good job of putting it into perspective just how much more efficient they have gotten. I have a machine in my home with a single RX 7900 XTX in it. On that machine, I am able to run language models that blow GPT-3.5 Turbo out of the water in terms of quality, knowledge, and even speed! That is crazy to think about when you consider how large and capable that model was.
I can often get away with just using models locally in contexts that I care about privacy. Sometimes I will use more capable models through APIs to generate richer prompts than I could write myself to be able to better guide local models too.
> Learning something online 5 years ago often involved trawling incorrect, outdated or hostile content and attempting to piece together mental models without the chance to receive immediate feedback on intuition or ask follow up questions.
That trained and sharpened invaluable skills involving critical thinking and grit.
> [Trawling around online for information] trained and sharpened invaluable skills involving critical thinking and grit.
Here's what Socrates had to say about the invention of writing.
> "For this invention will produce forgetfulness in the minds of those who learn to use it, because they will not practice their memory. Their trust in writing, produced by external characters which are no part of themselves, will discourage the use of their own memory within them. You have invented an elixir not of memory, but of reminding; and you offer your pupils the appearance of wisdom, not true wisdom, for they will read many things without instruction and will therefore seem [275b] to know many things, when they are for the most part ignorant and hard to get along with, since they are not wise, but only appear wise."
I mean, he wasn't wrong! But nonetheless I think most of us communicating on an online forum would probably prefer not to go back to a world without writing. :)
You could say similar things about the internet (getting your ass to the library taught the importance of learning), calculators (you'll be worse at doing arithmetic in your head), pencil erasers (https://www.theguardian.com/commentisfree/2015/may/28/pencil...), you name it.
And also taught people how to actually look for information online. The average person still does not know how to google, I still see people writing whole sentences in the search bar.
LLMs, by design, are peak Dunning-Krugers, which means they can only be any good as a study partner for basic introductory lessons and topics. Yet they still require handholding and thorough verification, because LLMs will spit out factually incorrect information with confidence and will fold on correct answers when prodded. Yet the novice does not possess the skill to handhold the LLM. I think there's a word for that, but chadgbt is down for me today.
Furthermore, the forgetting curve is a thing, and therefore having to piece information together repetitively, preferably in a structured manner, leads to much better information retention. People love to claim how fast they are "learning" (more like consuming TikToks) from podcasts at 2x speed and LLMs, but are unable to recite whatever was presented a few hours later.
Third, there was a paper circulating even here on HN that showed that use of LLMs literally hinders brain activation.
In my experience asking questions to Claude, the amount of incorrect information it gives is on a completely different scale in comparison to traditional sources. And the information often sounds completely plausible too. When using a text book, I would usually not Google every single piece of new information to verify it independently, but with Claude, doing that is absolutely necessary. At this point I only use Claude as a stepping stone to get ideas on what to Google because it is giving me false information so often. That is the only "effective" usage I have found for it, which is obviously much less useful than a good old-fashioned textbook or online course.
Admittedly I have less experience with ChatGPT, but those experiences were equally bad.
>I'm puzzled (but not surprised) by the standard HN resistance & skepticism
The good: it can objectively help you to zoom forward in areas where you don’t have a quick way forward.
The bad: it can objectively give you terrible advice.
It depends on how you sum that up on balance.
Example: I wanted a way forward to program a chrome extension which I had zero knowledge of. It helped in an amazing way.
Example: I keep trying to use it in work situations where I have lots of context already. It performs better than nothing but often worse than nothing.
HN is resistant because at the end of the day, these are LLMs. They cannot and do not think. They generate plausible responses. Try this in your favorite LLM:
"Suppose you're on a game show trying to win a car. There are three doors, one with a car and two with goats. You pick a door. The host then gives you the option to switch doors. What is the best strategy in this situation?" The LLM will recognize this as SIMILAR to the Monty Hall problem and tell you to always switch. I just reproduced this with ChatGPT.
But this is completely wrong! In the Monty Hall problem, the host has to reveal a door with a goat behind it for you to gain the benefit of switching. I have to point this out for the LLM to get it right. It did not reason about the problem I gave it, it spat out the most likely response given the "shape" of the problem.
This is why shrugging and saying "well humans get things wrong too" is off base. The problem is that the LLM is not thinking, period. So it cannot create a mental model of your understanding of a subject, it is taking your text and generating the next message in a conversation. This means that the more niche the topic (or your particular misunderstanding), the less useful it will get.
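For anyone who wants to check the variant numerically, here's a quick simulation of the problem exactly as stated (plain Python; the host opens nothing):

    import random

    def simulate(trials=100_000):
        # Variant as stated: three doors, you pick one, the host opens NOTHING,
        # and you may switch to one of the other two doors.
        stay_wins = switch_wins = 0
        for _ in range(trials):
            car = random.randrange(3)
            pick = random.randrange(3)
            if pick == car:
                stay_wins += 1
            # No goat has been revealed, so "switching" is just a blind re-pick
            # among the remaining two doors.
            new_pick = random.choice([d for d in range(3) if d != pick])
            if new_pick == car:
                switch_wins += 1
        return stay_wins / trials, switch_wins / trials

    stay, switch = simulate()
    print(f"stay: {stay:.3f}  switch: {switch:.3f}")  # both come out around 1/3

Staying and switching each win about a third of the time, so "always switch" is simply the wrong answer here.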
That's because the LLM assumes you mean for the host to open the door.
As an autistic person I can assure you that people filling in information you neither gave nor implied is the norm for human conversations. In that sense, the LLMs are being human-like.
If you explicitly tell it you didn't forget to have the host open the door it gives you the right answer:
> There is a game show with three closed doors. One has a car and two have goats. I pick a door. Then, without opening my door or any other door, the host asks me if I want to switch my choice to one of the other two doors. What is my best strategy and what are the probabilities?
Gemini: This is a classic probability puzzle known as the Monty Hall problem, though your description presents a slight variation. The optimal strategy depends entirely on whether the host opens a door or not.
Your Scenario: The Host Doesn't Open a Door
In the situation you've described, where the host asks if you want to switch your choice to one of the other two doors without opening any doors, there is no advantage to switching.
Here's a breakdown of the probabilities in your specific scenario:
Your initial choice: When you first pick a door, you have a 1 in 3 chance of selecting the door with the car. This means you have a 2 in 3 chance of having picked a door with a goat.
The other two doors: The combined probability that the car is behind one of the other two doors is 2/3.
Switching your choice: If you decide to switch, you are essentially choosing one of the other two doors. Since there is no new information (a door with a goat has not been revealed), the 2/3 probability is evenly split between those two doors. Therefore, your chance of winning by switching to a specific one of the other doors is 1/3.
In this case, your odds of winning are 1/3 whether you stay with your original choice or switch to one of the other doors.
Humans who have heard of Monty Hall might also say you should always switch without noticing that the situation is different. That's not evidence that they can't think, just that they're fallible.
People on here always assert LLMs don't "really" think or don't "really" know without defining what all that even means, and to me it's getting pretty old. It feels like an escape hatch so we don't feel like our human special sauce is threatened, a bit like how people felt threatened by heliocentrism or evolution.
LLMs are vulnerable to your input because they are still computers, but you're setting it up to fail with how you've given it the problems. Humans would fail in similar ways. The only thing you've proven with this reply is that you think you're clever, but really, you are not thinking, period.
I use the Monty Hall problem to test people in two steps. The second step is, after we discuss it and come up with a framing that they can understand, can they then explain it to a third person. The third person rarely understands, and the process of the explanation reveals how shallow the understanding of the second person is. The shallowest understanding of any similar process that I've usually experienced is an LLM.
It's quite boring to listen to people praising AI (worshipping it, putting it on a pedestal, etc). Those who best understand the potential of it aren't doing that. Instead they're talking about various specific things that are good or bad, and they don't go out of the way to lick AI's boots, but when they're asked they acknowledge that they're fans of AI or bullish on it. You're probably misreading a lot of resistance & skepticism on HN.
Regardless of stance, I sure do hate being gaslit on how I'm supposed to think of content on any given topic. A disagreeable point of view is not equivalent to "manufacturing outrage".
Yeah, I've been a game-dev forever and had never built a web-app in my life (even in college) I recently completed my 1st web-app contract, and gpt was my teacher. I have no problem asking stupid questions, tbh asking stupid questions is a sign of intelligence imo. But where is there to even ask these days? Stack Overflow may as well not exist.
Right on. A sign of intelligence but more importantly of bravery, and generosity. A person that asks good questions in a class improves the class drastically, and usually learns more effectively than other students in the class.
I sometimes intentionally ask naive questions, even if I think I already know the answer. Sometimes the naive question provokes a revealing answer that I have not even considered. Asking naive questions is a learning hack!
I know some Spanish - close to B1. I find ChatGPT to be a much better way to study than the standard language apps. I can create custom lessons, ask questions about language nuances etc. I can also have it speak the sentences and practice pronunciation.
20 years ago I used to hang out in IRC channels where I learnt so much. I wasn't afraid of asking stupid questions. These bots are pale imitation of that.
I've learnt a great many things online, but I've also learnt a great many more from books, other people and my own experience. You just have to be selective. Some online tutorials are excellent, for example the Golang and Rust tutorials. But for other things books are better.
What you are missing is the people. We used to have IRC and forums where you could discuss things in great depth. Now that's gone, and with the web owned by big tech and governments, you're happy to accept a bot instead. It's sad really.
> Should we trust the information at face value without verifying from other sources? Of course not, that's part of the learning process.
It mostly isn't; the point of a good learning process is to invest time into verifying "once" and then add the verified facts to the learning material, so that learners can spend that time learning the material instead of verifying everything again.
Learning to verify is also important, but it's a different skill that doesn't need to be practiced literally every time you learn something else.
Otherwise you significantly increase the costs of the learning process.
>Should we trust the information at face value without verifying from other sources? Of course not, that's part of the learning process. Will some (most?) people rely on it lazily without using it effectively? Certainly, and this technology won't help or hinder them any more than a good old fashioned textbook.
Not true, if we make the assumption that most books from publishing houses with a good reputation are verified for errors. Good books may be dated, but they don't contain made-up things.
>Learning something online 5 years ago often involved trawling incorrect, outdated or hostile content and attempting to piece together mental models without the chance to receive immediate feedback on intuition or ask follow up questions. This is leaps and bounds ahead of that experience.
Researching online properly requires cross-referencing, seeing different approaches, and understanding various strengths, weaknesses, and biases among such sources.
And that's for objective information, like math and science. I thought Grok's uhh... "update" shows enough of the dangers when we resort to a billionaire-controlled oracle as an authoritative resource.
>Will some (most?) people rely on it lazily without using it effectively? Certainly, and this technology won't help or hinder them any more than a good old fashioned textbook.
I don't think facilitating bad habits like lazy study is an effective argument. And I don't really subscribe to this inevitability angle either: https://tomrenner.com/posts/llm-inevitabilism/
A lot of the comments have to do with how does one use these things to speed up learning. I've tried a few things. A couple of them are prompts:
1. Make me a tutorial on ...
2. Make probes to quiz me along the way ...
I think the trick is to look at the references that the model shows you, e.g. o3 with web search will give you lots of references. 90% of the time, just reading those tells me if the model and I are aligned.
For example, the other day I was figuring out why, when using SQLAlchemy Sessions with async PyTest tests, I was getting the "Connection was attached to a different loop" error. If you just asked o3 for a solution, it would take a long time, because it would make small mistakes in the code and you would spend a lot of time trying to fix them. The better way to use o3 was to ask it to give you debugging statements (session listeners attached to SQLAlchemy sessions) and to understand, by reading the code output, what was going on. Much faster.
Once it (and I) started looking at the debugging statements, the error became clear: the sessions/connections were leaking to a different event loop, and a loop_scope= param needed to be specified for all fixtures. o3 did not provide a correct solution for the code, but I could, and its help was crucial in writing a fuck ton of debugging code and getting clues.
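For anyone hitting the same error, the shape of the fix looks roughly like this. A sketch only, assuming pytest-asyncio >= 0.24 (where loop_scope was introduced) and SQLAlchemy 2.x async, with placeholder names and an in-memory database:

    import pytest
    import pytest_asyncio
    from sqlalchemy.ext.asyncio import AsyncSession, create_async_engine

    @pytest_asyncio.fixture(loop_scope="session", scope="session")
    async def engine():
        # Engine is created on the session-scoped event loop...
        engine = create_async_engine("sqlite+aiosqlite:///:memory:")
        yield engine
        await engine.dispose()

    @pytest_asyncio.fixture(loop_scope="session")
    async def db_session(engine):
        # ...and sessions/tests are pinned to that same loop, so connections
        # are no longer "attached to a different loop".
        async with AsyncSession(engine) as session:
            yield session

    @pytest.mark.asyncio(loop_scope="session")
    async def test_something(db_session):
        ...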
I also asked o3 to make a bunch of probe questions to test me; for example it said something like: try changing the loop_scope from module to function, what do you expect the loop id and transaction id to be for this test?
I learned more than I realized about ORMs, how they can be used to structure transactions, and how to structure async PyTest tests.
One thing I'm trying these days is to have it create a memory palace from all the stuff I have in my house, link it to a new concept I'm learning, and put it into an Anki deck.
Firstly, I think skepticism is a healthy trait. It's OK to be a skeptic. I'm glad there are a lot of skeptics because skepticism is the foundation of inquiry, including scientific inquiry. What if it's not actually Zeus throwing those lightning bolts at us? What if the heliocentric model is correct? What if you actually can't get AIDS by hugging someone who's HIV positive? All great questions, all in opposition to the conventional (and in some cases "expert") wisdom of their time.
Now in regards to LLMs, I use them almost every day, so does my team, and I also do a bit of postmortem and reflection on what was accomplished with them. So, skeptical in some regards, but certainly not behaving like a Luddite.
The main issue I have with all the proselytization about them, is that I think people compare getting answers from an LLM to getting answers from Google circa 2022-present. Everyone became so used to just asking Google questions, and then Google started getting worse every year; we have pretty solid evidence that Google's results have deteriorated significantly over time. So I think that when people say the LLM is amazing for getting info, they're comparing it to a low baseline. Yeah maybe the LLM's periodically incorrect answers are better than Google - but are you sure they're not better than just RTFM'ing? (Obviously, it all depends on the inquiry.)
The second, related issue I have is that we are starting to see evidence that the LLM inspires more trust than it deserves due to its humanlike interface. I recently started to track how often Github Copilot gives me a bad or wrong answer, and it's at least 50% of the time. It "feels" great though because I can tell it that it's wrong, give it half the answer, and then it often completes the rest and is very polite and nice in the process. So is this really a productivity win or is it just good feels? There was a study posted on HN recently where they found the LLM actually decreases the productivity of an expert developer.
So I mean I'll continue to use this thing but I'll also continue to be a skeptic, and this also feels like kinda where my head was with Meta's social media products 10 years ago, before I eventually realized the best thing for my mental health was to delete all of them. I don't question the potential of the tech, but I do question the direction that Big Tech may take it, because they're literal repeat offenders at this point.
There is definitely this pain point that some people talk about (even in this thread) of how "well at least AI doesn't berate me or reject my answer for bureaucratic reasons". And I find that intriguing in a community like this. Even some extremely techy people (or especially?) just want to at best feel respected, or at worst want to have their own notions confirmed by someone they deem to be "smart".
>I don't question the potential of the tech, but I do question the direction that Big Tech may take it, because they're literal repeat offenders at this point.
And that indeed is my biggest reservation here. Even if AI can do great things, I don't trust the incentive models OpenAI has. Instead of potentially being this bastion of knowledge, it may be yet another vector of trying to sell you ads and steal your data. My BOTD is long gone now.
> Certainly, and this technology won't help or hinder them any more than a good old fashioned textbook.
Except that the textbook was probably QA’d by a human for accuracy (at least any intro college textbook, more specialized texts may not have).
Matters less when you have background in the subject (which is why it’s often okay to use LLMs as a search replacement) but it’s nice not having a voice in the back of your head saying “yeah, but what if this is all nonsense”.
> Except that the textbook was probably QA’d by a human for accuracy
Maybe it was not when printed in the first edition, but at least it was the same content shown to hundreds of people rather than something uniquely crafted for you.
The many eyes looking at it will catch errors and course-correct, while the LLM output does not get the benefit of that error correction algorithm, because someone who knows the answer probably won't ask and check it.
I feel this way about reading maps vs following GPS navigation; the fact that Google asked me to take an exit here as a short-cut feels like it might be trying to solve Braess' paradox in real time.
I wonder if this route was made for me to avoid my car adding to some congestion somewhere, and whether that actually benefits me or just the people already stuck on that road.
There is no skepticism. LLMs are fundamentally lossy and as a result they’ll always give some wrong result/response somewhere. If they are connected to a data source, this can reduce the error rate but not eliminate it.
I use LLMs but only for things that I have a good understanding of.
I think both sides seem to have the same issues with the other. One side is sceptical that the other is getting good use from LLMs, and the other suggests they're just not using it correctly.
Both sides think the other is either exaggerating or just not using the tool correctly.
What both sides should do is show evidence in the form of chat extracts or videos. There are a number from the pro-LLM side, but obviously selection bias applies here. It would be interesting if the anti-LLM side started to post more negative examples (real chat extracts or videos).
In my experience, most educational resources are either slightly too basic or slightly too advanced, particularly when you're trying to understand some new and unfamiliar concept. Lecturers, Youtubers and textbook authors have to make something that works for everybody, which means they might omit information you don't yet know while teaching you things you already understand. This is where LLMs shine, if there's a particular gap in your knowledge, LLMs can help you fill it, getting you unstuck.
At what cost? Are you considering all the externalities? What do you think will happen when Altman (and their investors) decides to start collecting their paychecks?
>I'm puzzled (but not surprised) by the standard HN resistance & skepticism
Thinking back, I believe the change from enthusiasm to misanthropy (mis[ai]thropy?) happened around the time, and in increasing proportion to, it became a viable replacement for some of the labor performed by software devs.
Before that, the tone was more like "The fact is, if 80% of your job or 80% of its quality can be automated, it shouldn't be a job anymore."
I think it's just that there's been enough time and the magic has worn off. People have used it enough now and everybody has made their experiences. They initially were so transfixed that they didn't question the responses. Now people are doing that more often, and realising that likelihood of co-occurrence isn't a good measure of factuality. We've realised that the number of human jobs where it can reach 8%, let alone 80%, of quality is vanishingly small.
I am just surprised they used an example requiring calculation/math. In the field the results are very much mixed. Otherwise it of course is a big help.
Knowing myself, it perhaps wasn't that bad that I didn't have such tools; it depends on the topic. I couldn't imagine ever writing a thesis without an LLM anymore.
Yeah. I’ll take this over the “you’re doing it wrong” condescension of comp.lang.lisp, or the Debian mailing list. Don’t even get me started on the systemd channels back in the day.
On the flip, I prefer the human touch of the Kotlin, Python, and Elixir channels.
> Should we trust the information at face value without verifying from other sources? Of course not, that's part of the learning process.
People who are learning a new topic are precisely the people least able to do this.
A friend of mine used chatgpt to try to learn calculus. It gave her an example...with constants changed in such a way that the problem was completely different (in the way that 1/x^2 is a totally different integration problem than 1/(x^2 + 1)). It then proceeded to work the problem incorrectly (ironically enough, in exactly the way that I'd expect a calculus student who doesn't really understand algebra to do it incorrectly), produced a wrong answer, and merrily went on to explain to her how to arrive at that wrong answer.
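For reference, the two integrals really are unrelated problems, one the power rule and the other the arctangent form:

    \int \frac{dx}{x^{2}} = -\frac{1}{x} + C
    \qquad\text{vs.}\qquad
    \int \frac{dx}{x^{2}+1} = \arctan(x) + C

A model (or student) pattern-matching on "one over something squared" will happily work the second one like the first and arrive, confidently, at the wrong answer.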
The last time I tried to use an LLM to analyze a question I didn't know the answer to (analyze a list of states to which I couldn't detect an obvious pattern), it gave me an incorrect answer that (a) did not apply to six of the listed states, (b) DID apply to six states that were NOT listed, even though I asked it for an exclusive property, (c) miscounted the elements of the list, and (d) provided no less than eight consecutive completely-false explanations on followup, only four of which it caught itself, before finally giving up.
I'm all for expanding your horizons and having new interfaces to information, but reliability is especially important when you're learning (because otherwise you build on broken foundations). If it fails at problems this simple, I certainly don't trust it to teach me anything in fields where I can't easily dissect bullshit. In principle, I don't think it's impossible for AI to get there; in practice, it doesn't seem to be.
Another quality is that everything is written. To me having a text support to discuss and the discussion recorded in text format is one of the strongest support someone can get when learning.
I'd share a little bit experience about learning from human teachers.
Here in my country, English is not what you'll hear in everyday conversation. Native English speakers account for a tiny percentage of the population. Our language doesn't resemble English at all. However, English is a required subject in our mandatory education system. I believe this situation is quite typical across many Asian countries.
As you might imagine, most English teachers in public schools are not native speakers. And they, just like other language learners, make mistakes that native speakers won't make without even realizing what's wrong. This creates a cycle enforcing non-standard English pragmatics in the classroom.
Teachers are not to blame. Becoming fluent and proficient enough in a second language to handle questions students spontaneously throw to you takes years, if not decades of immersion. It's an unrealistic expectation for an average public school teacher.
The result is rich parents either send their kids to private schools or have extra classes taught by native speakers after school. Poorer but smart kids realize the education system is broken and learn their second language from Youtube.
-
What's my point?
When it comes to math/science, in my experience, the current LLMs act similarly to the teachers in public school mentioned above. And they're worse in history/economics. If you're familiar with the subject already, it's easy to spot LLM's errors and gather the useful bits from their blather. But if you're just a student, it can easily become a case of blind-leading-the-blind.
It doesn't make LLMs completely useless in learning (just like I won't call public school teachers 'completely useless', that's rude!). But I believe in the current form they should only play a rather minor role in the student's learning journey.
Can we not criticize tech without being considered luddites anymore? I don't fear for my job over AI replacement, it is just fundamentally wrong on many answers.
In my field there is also the moral/legal implications of generative AI.
It does go both ways. You can ask stupid questions without fear of embarrassment or ruined reputation, and it can respond with stupid answers without fear of embarrassment or ruined reputation.
It can confidently spew completely wrong information and there's no way to tell when it's doing that. There's a real risk that it will teach you a complete lie based on how it "thinks" something should work, and unlearning that lie will be much harder than just learning the truth initially
>Beside there isn’t any of the usual drawback with privacy because no one care if OpenAI learn about some bullshit you were told to learn
you didn't see the Hacker News thread talking about the ChatGPT subpoena, did you? I was a bit shocked that 1) a tech community didn't think a company would store data you submit to their servers and 2) that they felt like some lawyers and judges reading their chat logs was some intimate invasion of privacy.
Let's just say I certainly cannot be arsed to read anyone else's stream of consciousness without being paid like a lawyer. I deal with kids and it's a bit cute when they babble about semi-coherent topics. An adult clearly loses that cute appeal and just sounds like a madman.
That's not even some dig, I sure suck at explaining my mindspace too. It's a genuinely hard skill to convert thoughts to interesting, or even sensible, communication.
> An underrated quality of LLMs as a study partner is that you can ask "stupid" questions without fear of embarrassment.
Even more important for me, as someone who did ask questions but less and less over time, is this: with GPTs I no longer have to see the passive-aggressive banner saying
> This question exists for historical reasons, not because it's a good question.
all the time on other peoples questions, and typically on the best questions with the most useful answers there were.
As much as I have mixed feelings about where AI is heading, I’ll say this: I’m genuinely relieved I don’t need to rely on Stack Overflow anymore.
It is also deeply ironic how stackoverflow alienated a lot of users in the name of inclusion (the Monica case) but all the time they themselves were the ones who really made people like me uncomfortable.
I'll personally attest: LLMs have been absolutely incredible to self-learn new things post graduation. It used to be that if you got stuck on a concept, you're basically screwed. Unless it was common enough to show up in a well formed question on stack exchange, it was pretty much impossible, and the only thing you can really do is keep paving forward and hope at some point, it'll make sense to you.
Now, everyone basically has a personal TA, ready to go at all hours of the day.
I get the commentary that it makes learning too easy or shallow, but I doubt anyone would think that college students would learn better if we got rid of TA's.
>Now, everyone basically has a personal TA, ready to go at all hours of the day
This simply hasn't been my experience.
It's too shallow. The deeper I go, the less it seems to be useful. This happens quick for me.
Also, god forbid you're researching a complex and possibly controversial subject and you want it to find reputable sources or particularly academic ones.
2) When I have a vague idea, it helps me narrow down the correct terminology for it
3) Providing examples of a particular category ("are there any examples of where v1 in the visual cortex develops in a disordered way?")
4) "Tell me the canonical textbooks in field X"
5) Posing math exercises
6) Free form branching--while talking about one topic, I want to shift to another that is distinct but related.
I agree they leave a lot to be desired when digging very deeply into a topic. And my biggest pet peeve is when they hallucinate fake references ("tell me papers that investigate this topic" will, for any sufficiently obscure topic, result in a bunch of very promising paper titles that are wholly invented).
My core problem with LLMs is as you say; it's good for some simpler concepts, tasks, etc. but when you need to dive into more complex topics it will oversimplify, give you what you didn't ask for, or straight up lie by omission.
History is a great example, if you ask an LLM about a vaguely difficult period in history it will just give you one side and act like the other doesn't exist, or if there is another side, it will paint them in a very negative light which often is poorly substantiated; people don't just wake up and decide one day to be irrationally evil with no reason, if you believe that then you are a fool... although LLMs would agree with you more times than not since it's convenient.
The result of these things is a form of gatekeeping, give it a few years and basic knowledge will be almost impossible to find if it is deemed "not useful" whether that's an outdated technology that the LLM doesn't seem talked about very much anymore or a ideological issue that doesn't fall in line with TOS or common consensus.
It's a floor raiser, not a ceiling raiser. It helps you get up to speed on general conventions and consensus on a topic, less so on going deep on controversial or highly specialized topics
> Also, god forbid you're researching a complex and possibly controversial subject and you want it to find reputable sources or particularly academic ones.
That's fine. Recognize the limits of LLMs and don't use them in those cases.
Yet that is something you should be doing regardless of the source. There are plenty of non-reputable sources in academic libraries and there are plenty of non-reputable sources from professionals in any given field. That is particularly true when dealing with controversial topics or historical sources.
It can be beneficial for making your initial assessment, but you'll need to dig deeper for something meaningful. For example, I recently used Gemini's Deep Research to do some literature review on educational Color Theory in relation to PowerPoint presentations [1]. I know both areas rather well, but I wanted to have some links between the two for some research that I am currently doing.
I'd say that companies like Google and OpenAI are aware of the "reputable" concerns the Internet is expressing and addressing them. This tech is going to be, if not already is, very powerful for education.
Grandparent testimony of success, & parent testimony of frustration, are both just wispy random gossip when they don't specify which LLMs delivered the reported experiences.
The quality varies wildly across models & versions.
With humans, the statements "my tutor was great" and "my tutor was awful" reflect very little on "tutoring" in general, and are barely even responses to each other without more specificity about the quality of tutor involved.
Hmm. I have had pretty productive conversations with ChatGPT about non-linear optimization.
Granted, that's probably well-trodden ground, to which model developers are primed to pay attention, and I'm (a) a relative novice with (b) very strong math skills from another domain (computational physics). So Chuck and I are probably both set up for success.
What are some subjects that ChatGPT has given only shallow instruction on?
I'll tell you that I recently found it the best resource on the web for teaching me about the 30 Years War. I was reading a collection of primary source documents, and was able to interview ChatGPT about them.
Last week I used it to learn how to create and use Lehmer codes, and its explanation was perfect, and much easier to understand than, for example, Wikipedia.
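For the curious, the whole idea fits in a few lines; a sketch for permutations of 0..n-1:

    def lehmer_code(perm):
        # For each position, count how many later entries are smaller.
        return [sum(later < x for later in perm[i + 1:]) for i, x in enumerate(perm)]

    def from_lehmer(code):
        # Rebuild the permutation by picking the c-th smallest unused value.
        pool = list(range(len(code)))
        return [pool.pop(c) for c in code]

    assert lehmer_code([1, 3, 0, 2]) == [1, 2, 0, 0]
    assert from_lehmer([1, 2, 0, 0]) == [1, 3, 0, 2]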
I ask it about truck repair stuff all the time, and it is also great at that.
I don't think it's great at literary analysis, but for factual stuff it has only ever blown away my expectations at how useful it is.
It sounds like it is a good tool for getting you up to speed on a subject and you can leverage that newfound familiarity to better search for reputable sources on existing platforms like google scholar or arXiv.
The worst is when it's confidently wrong about things... Thankfully, this occurrence is becoming less & less common -- or at least, its boundary is beyond my subject matter expertise.
I validate models in finance, and this is by far the best tool created for that purpose. I'd compare financial model validation to a Master's-level task, where you're working with well-established concepts, but at a deep, technical level. LLMs excel at that: they understand model assumptions, know what needs to be tested to ensure correctness, and can generate the necessary code and calculations to perform those tests. And finally, they can write the reports.
Model Validation groups are one of the targets for LLMs.
> It's too shallow. The deeper I go, the less it seems to be useful. This happens quick for me.
You must be using a free model like GPT-4o (or the equivalent from another provider)?
I find that o3 is consistently able to go deeper than me in anything I'm a nonexpert in, and usually can keep up with me in those areas where I am an expert.
If that's not the case for you I'd be very curious to see a full conversation transcript (in chatgpt you can share these directly from the UI).
I have found that being very specific and asking things like "can you tell me what another perspective might be, such that I can understand potential counter-arguments might be, and how people with other views might see this topic?" can be helpful when dealing with complex/nuanced/contentious subjects. Likewise with regard to "reputable" sources.
This can happen if you use the free model and not a paid deep research model. You can use a GPT model and ask things like, "how many moons does Jupiter have?" But if you want to ask, "can you go on the web and research the effects that chemical A has had on our water supply and cite sources?", you will need to use a deep research model.
This is where feeding in extra context matters. Paste in text that shows up from a google search, textbooks preferred, to get in depth answers.
No one builds multi-shot search tools because they eat tokens like nobody's business, but I've deployed them internal to a company with rave reviews at the cost of $200 per seat per day.
Ask it for sources. The two things where LLMs excel is by filling the sources on some claim you give it (lots will be made up, but there isn't anything better out there) and by giving you queries you can search for some description you give it.
“The deeper I go, the less it seems to be useful. This happens quick for me.
Also, god forbid you're researching a complex and possibly controversial subject and you want it to find reputable sources or particularly academic ones.”
These things also apply to humans. A year or so ago I thought I’d finally learn more about the Israeli/Palestinians conflict. Turns out literally every source that was recommended to me by some reputable source was considered completely non-credible by another reputable one.
That said I’ve found ChatGPT to be quite good at math and programming and I can go pretty deep at both. I can definitely trip it into mistakes (eg it seems to use calculations to “intuit” its way around sometimes and you can find dev cases where the calls will lead it the wrong directions), but I also know enough to know how to keep it on rails.
This is the part where you actually need to think and wonder if AI is the right tool in this particular purpose. Unfortunately you can't completely turn your brain off just yet.
It is shallow. But as long as what you're asking it of is the kind of material covered in high school or college, it's fairly reliable.
This generation of AI doesn't yet have the knowledge depth of a seasoned university professor. It's the kind of teacher that you should, eventually, surpass.
When ChatGPT came out it was like I had the old Google back.
Learning a new programming language used to be mediated with lots of useful trips to Google to understand how some particular bit worked, but Google stopped being useful for that years ago. Even if the content you're looking for exists, it's buried.
And the old ChatGPT was nothing compared to what we have today, nowadays reasoning models will eat through math problems no problem when this was a major limitation in the past.
I've learnt Rust in 12 weeks with a study plan that ChatGPT designed for me, catering to my needs and encouraging me to take notes and write articles. This way of learning allowed me to publish https://rustaceo.es for Spanish speakers made from my own notes.
I think the potential in this regard is limitless.
Absolutely. I used to have a lot of weird IPv6 issues in my home network I didn't understand. ChatGPT helped me to dump some traffic with tcpdump and explained what was happening on the network.
In the process it helped me to learn many details about RA and NDP (Router Advertisments/Neighbor Discovery Protocol, which mostly replace DHCP and ARP from IPv4).
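The capture side of that is small enough to sketch. Roughly, assuming scapy is installed, the script has capture privileges, and the interface name is a placeholder:

    from scapy.all import sniff
    from scapy.layers.inet6 import ICMPv6ND_NA, ICMPv6ND_NS, ICMPv6ND_RA

    def show_nd(pkt):
        # Print only Router Advertisements, Neighbor Solicitations and
        # Neighbor Advertisements -- the NDP traffic discussed above.
        if pkt.haslayer(ICMPv6ND_RA) or pkt.haslayer(ICMPv6ND_NS) or pkt.haslayer(ICMPv6ND_NA):
            print(pkt.summary())

    sniff(iface="eth0", filter="icmp6", prn=show_nd, store=False)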
It made me realize that my WiFi mesh routers do quite a lot of things to prevent broadcast loops on the network, and that all my weird issues could be attributed to one cheap mesh repeater. So I replaced it and now everything works like a charm.
I had this setup for 5 years and was never able to figure out what was going on there, although I really tried.
My rule with LLMs has been "if a shitty* answer fast gets you somewhere, the LLMs are the right tool," and that's where I've seen them for learning, too. There are times when I'm reading a paper, and there's a concept mentioned that I don't know - I could either divert onto a full Google search to try to find a reasonable summary, or I can ask ChatGPT and get a quick answer. For load-bearing concepts or knowledge, yes, I need to put the time in to actually research and learn a concept accurately and fully, but for things tangential to my actual current interests or for things I'm just looking at for a hobby, a shitty answer fast is exactly what I want.
I think this is the same thing with vibe coding, AI art, etc. - if you want something good, it's not the right tool for the job. If your alternative is "nothing," and "literally anything at all" will do, man, they're game changers.
* Please don't overindex on "shitty" - "If you don't need something verifiably high-quality"
I agree. I recently bought a broken Rolex and asked GPT for a list of tools I should get on Amazon to work on it.
I tried using YouTube to find walk through guides for how to approach the repair as a complete n00b and only found videos for unrelated problems.
But I described my issues and sent photos to GPT o3-pro, and it was able to guide me and tell me what to watch out for.
I completed the repair (very proud of myself) and even though it failed a day later (I guess I didn’t re-seat well enough) I still feel far more confident opening it and trying again than I did at the start.
Cost of broken watch + $200 pro mode << Cost of working watch.
"Keep paving forward" can sometimes be fruitful, and at other times be an absolutely massive waste of time.
I'm not sold on LLMs being a replacement, but post-secondary was certainly enriched by having other people to ask questions to, people to bounce ideas off of, people that can say "that was done 15 years ago, check out X", etc.
There were times where I thought I had a great idea, but it was based on an incorrect conclusion that I had come to. It was helpful for that to be pointed out to me. I could have spent many months "paving forward", to no benefit, but instead someone saved me from banging my head on a wall.
In college sometimes asking the right question in class or in a discussion section led by a graduate student or in a study group would help me understand something. Sometimes comments from a grader on a paper would point out something I had missed. While having the diligence to keep at it until you understand is valuable, the advantage of college over just a pile of textbooks is in part that there are other resources that can help you learn.
Imagine you're in college, have to learn calculus, and you can't afford a textbook (nor can find a free one), and the professor has a thick accent and makes many mistakes.
Sure, you could pave forward, but realistically, you'll get much farther with either a good textbook or a good teacher, or both.
I share your experience and view in that regard! There is so much criticism of LLMs and some of it is fair, like the problem of hallucinations, but that weakness can be reframed as a learning opportunity. It's like discussing a subject with a personal scientist who may at certain times test you, by making claims that may be simplistic or outright wrong, to keep the student skeptical and check if they are actually paying attention.
This requires the student to be genuinely interested in what they are learning, though. For those who blindly trust its output, it can have adverse effects, like the illusion of having understood a concept that they may in fact have mislearned.
I agree... spent last weekend chatting with an LLM, filling in knowledge gaps I had on the electromagnetic spectrum. It does an amazing job educating you on known unknowns, but I think being able to know how to ask the right questions is key. I don't know how it would do with unknown unknowns, which is where I think books really shine and are still a preferable learning method.
"It used to be that if you got stuck on a concept, you're basically screwed."
There seems to be a gap in problem-solving abilities here... the process of breaking concepts down into easier-to-understand pieces and then recompiling them has been around forever... it is just easier to find those relationships now. To say it was impossible to learn concepts you are stuck on is a little alarming.
> It used to be that if you got stuck on a concept, you're basically screwed
No, not really.
> Unless it was common enough to show up in a well formed question on stack exchange, it was pretty much impossible, and the only thing you can really do is keep paving forward and hope at some point, it'll make sense to you.
Your experience isn't universal. Some students learned how to do research in school.
"Screwed" = spending hours sifting through poorly-written, vaguely-related documents to find a needle in a haystack. Why would I want to continue doing that?
I do a lot of research and independent learning. The way I translated “screwed” was “4-6 hours to unravel the issue”. And half the time the issue is just a misunderstanding.
It’s exciting when I discover I can’t replicate something that is stated authoritatively… which turns out to be controversial. That’s rare, though. I bet ChatGPT knows it’s controversial, too, but that wouldn’t be as much fun.
Just as a car can be "beyond economical repair", a problem can be not worth the time (and uncertainty) of fixing, especially when judged subjectively with incomplete information.
Now think of all the times you didn't already know enough to go and find the real answer.
Ever read mainstream news reporting on something you actually know about? Notice how it's always wrong? I'm sure there's a name for this phenomenon. It sounds like exactly the same thing.
I don’t know what subject you are learning, but for circuit design I have failed to get any response out of LLMs that’s not straight from a well-known textbook chapter that I have already read.
It definitely depends heavily on how well represented the subject is on the internet at large. Pretty much every question I've asked it about SystemVerilog it gets wrong, but it can be very helpful with quite complex, random C questions - for example, why I might get undefined symbol errors with `inline` functions in C, but only in debug mode.
On the other hand it told me you can't execute programs when evaluating a Makefile and you trivially can. It's very hit and miss. When it misses it's rather frustrating. When it hits it can save you literally hours.
LLMs are to learning what self-driving cars are to transportation. They take you to the destination most of the time. But the problem is that if you use them too much, your brain (your legs) undergoes metaphorical atrophy, and when you are put in the position of having to do it on your own, you are worse off than you would be had you spent the time using your brain (legs). Learning is great, but learning to learn is the real skill set. You don't develop that if you are always getting spoonfed.
This is one of the challenges I see with self-driving cars. Driving requires a high level of cognitive processing to handle changing conditions and potential hazards. So when you drive most of your brain is engaged. The impact self-driving cars are going to have on mental stimulation, situational awareness, and even long-term cognitive health could be bigger than we think, especially if people stop engaging in tasks that keep those parts of the brain active. That said, I love the idea of my car driving me around the city while I play video games.
Regarding LLMs, they can also stimulate thinking if used right.
> It used to be that if you got stuck on a concept, you're basically screwed.
Given that humanity has been able to go from living in caves to sending spaceships to the moon without LLMs, let me express some doubt about that.
Even without going further, software engineering isn't new and people have been stuck on concepts and have managed to get unstuck without LLMs for decades.
What you gain in instant knowledge with LLMs, you lose in learning how to get unstuck, how to persevere, how to innovate, etc.
Depending on context, I would advise you to be extremely careful. Modern LLMs are Gell-Mann Amnesia squared. Once you've watched an LLM butcher a topic you know extremely well, it is spooky how much authority it still projects on the next interaction.
> It used to be that if you got stuck on a concept, you're basically screwed. Unless it was common enough to show up in a well formed question on stack exchange,
It’s called basic research skills - don’t they teach this anymore in high school, let alone college? How ever did we get by with nothing but an encyclopedia or a library catalog?
Something is lost as well if you do 'research' by just asking an LLM. On the path to finding your answer in the encyclopedia or academic papers, etc. you discover so many things you weren't specifically looking for. Even if you don't fully absorb everything there's a good chance the memory will be triggered later when needed: "Didn't I read about this somewhere?".
It's a little disingenuous to say that; most of us would never have gotten by with literally just a library catalog and an encyclopedia. A community to learn in is needed for almost anything difficult, and that has always been the case. That's not just about fundamentally difficult problems but also about simple misunderstandings.
If you don't have access to a community like that, learning stuff in a technical field can be practically impossible. Having an LLM to ask infinite silly/dumb/stupid questions can be super helpful and save you days of being stuck on silly things, even though it's not perfect.
> LLM's have been absolutely incredible to self learn new things post graduation.
I haven't tested them on many things. But in the past 3 weeks I tried to vibe code a little bit of VHDL. On the one hand it was a fun journey; I could experiment a lot and just iterate fast. But if I were someone who had no idea about hardware design, then this trash would've guided me the wrong way in numerous situations. I can't even count how many times it has built me latches instead of clocked registers (latches are bad, if you don't know about them), and that's just one thing.
Yes, I know there ain't much out there (compared to Python and JavaScript) about HDLs, even less regarding VHDL. But damn, no no no. Not for learning. Never. If you know what you're doing and have some fundamental knowledge of the topic, then it might help you get further, but not for the absolute essentials; that will backfire hard.
LLM's are useful because they can recommend several famous/well-known books (or even chapters of books) that are relevant to a particular topic. Then you can also use the LLM to illuminate the inevitable points of confusion and shortcomings in those books while you're reading and synthesizing them.
Pre-LLM, even finding the ~5 textbooks with ~3 chapters each that decently covered the material I want was itself a nontrivial problem. Now that problem is greatly eased.
Everything you state was already available on the net. Did people grow more informed? So far, practice suggests the opposite conclusion[0]. I hope for the best, but the state of the world so far doesn't justify it...
I use it to refresh some engineering maths I have forgotten (ODE, numerical schemas, solving linear equations, data sciences algorithms, etc) and the explanations are most of the time great and usually 2 or 3 prompts give me a good overview and explain the tricky details.
I also use it to remember some Python stuff. In Rust, it is less good: it makes mistakes.
In those two domains, at that level, it's really good.
I was recently researching and repairing an older machine with a 2020 Intel Gen 9 CPU and a certain socket motherboard, and AI made it so much easier and pleasant to find information and present answers about various generations and sockets and compatibility, I felt like I didn't deserve this kind of tool. LLMs are not great for some things, but amazing for others.
Maybe TAs are a good metaphor. Back in college, the classmates who went to TAs for help multiple times every week, really didn't get the material. I literally never went to a TA for help in my life, and learned the material much better by really figuring it out myself, "the hard way" (the only way?). These were math, EE, and CS courses.
This is important, as benchmarks indicate we aren't at a level where a LLM can truly be relied upon to teach topics across the board.
It is hard to verify information that you are unfamiliar with. It would be like learning from a message board. Can you really trust what is being said?
Agreed, I'd add to the statement, "you're basically screwed, often, without investing a ton of time (e.g. weekends)"
Figuring out 'make' errors when I was bad at C on microcontrollers a decade ago? (Still am.) Careful pondering of possible meanings of words... trial-and-error tweaks of code and recompiling in hopes that I was just off by a tiny thing, but 2 hours and 30 attempts later, realizing I'd done a bad job of tracking what I'd tried and hadn't? Well, it made me better at carefully triaging issues. But it wasn't something I was enthusiastic to pick back up the next weekend, or for the next idea I had.
Revisiting that combination of hardware/code a decade later and having it go much faster with ChatGPT... that was fun.
Are we really comparing that kind of research to just typing a question and having a good answer in a couple of seconds?
Like, I agree with you and I believe those things will resist and will always be important, but it doesn't really compare in this case.
Last week I was out in nature and saw a cute bird that I didn't know. I asked an AI and got the correct answer in 10 seconds.
Of course I could have found the answer at the library or by looking at proper niche sites, but I would not have done it, because I simply didn't care that much. It's a stupid example, but I hope it makes the point.
It's one more step on the path to A Young Lady's Illustrated Primer. Still a long way to go, but it's a burden off my shoulders to be able to ask stupid questions without judgment or assumptions.
>Now, everyone basically has a personal TA, ready to go at all hours of the day.
And that's a bad thing. Nothing can replace the work in learning, the moments where you don't understand it and have to think until it hurts and until you understand. Anything that bypasses this (including, for uni students, leaning too heavily on generous TAs) results in a kind of learning theatre, where the student thinks they've developed an understanding, but hasn't.
Experienced learners already have the discipline to use LLMs without asking too much of them, the same way they learned not to look up the answer in the back of the textbook until arriving at their own solution.
I'll personally attest anecdotes mean little in sound arguments.
When I got stuck on a concept, I wasn't screwed: I read more; books if necessary. StackExchange wasn't my only source.
LLMs are not like TAs, personal or not, in the same way they're not humans. So it then follows we can actually contemplate not using LLMs in formal teaching environments.
I've found LLMs to be great in summarizing non-controversial non-technical bodies of knowledge. For example - the facts in the long swings of regional histories. You have to ask for nuance and countervailing viewpoints, though you'll get them if they're in there.
I have been very skeptical of AI, but for getting unstuck when studying, it's a huge help. This is the first time I see the benefit of AI. I take a picture of a formula and ask ChatGPT to explain the steps.
I really don't get it. Literally the only thing you need to do research is to know what term to look up, and you get a bunch of info written by real humans.
>Unless it was common enough to show up in a well formed question on stack exchange, it was pretty much impossible
Sorry, but if you've gone to university, in particular at a time when internet access was already ubiquitous, surely you must have been capable of finding an answer to a programming problem by consulting documentation, manuals, or tutorials, which exist on almost any topic.
I'm not saying the chatbot interface is necessarily bad, it might be more engaging, but it literally does not present you with information you couldn't have found yourself.
If someone has a computer science degree and tells me without stack exchange they can't find solutions to basic problems that is a red flag. That's like the article about the people posted here who couldn't program when their LLM credits ran out
All the same ways I know when Internet comments, outdated books, superstitions, and other humans are bullshitting me.
Sometimes right away, something sounds wrong. Sometimes when I try to apply the knowledge and discover a problem. Sometimes never, I believe many incorrect things even today.
I haven't used LLMs too much for study yet, so maybe they really are force multipliers, but I completely disagree with your assessment of self-directed learning pre-llm, the paving forward part isn't so dire.
The internet, and esp. Stack Exchange, is a horrible place to learn concepts. For basic operational stuff, sure, that works, but one should mostly be picking up concepts from books and other long-form content. When you get stuck it's time to do three things:
Incorporate a new source that covers the same material in a different way, or at least from a different author.
Sit down with the concept and write about it and actively try to reformulate it and everything you do/don't understand in your own words.
Take a pause and come back later.
Usually one of these three strategies does the trick, no llm required. Obviously these approaches require time that using an LLM wouldn't. I have a suspicion doing it this way will also make it stick in long term memory better, but that's just a hunch.
Nah I'm calling BS, for me self-learning after college is either Just Do It(tm) trial-and-error, blogs, or hitting the nonfiction section of the library.
> DO NOT GIVE ANSWERS OR DO HOMEWORK FOR THE USER. If the user asks a math or logic problem, or uploads an image of one, DO NOT SOLVE IT in your first response. Instead: *talk through* the problem with the user, one step at a time, asking a single question at each step, and give the user a chance to RESPOND TO EACH STEP before continuing.
I wish each LLM provider would add "be short and not verbose" to their system prompts. I am a slow reader; it takes a toll on me to read through every non-important detail whenever I talk to an AI. The way they render everything so fast gives me anxiety.
This was in the linked prompt: "Be warm, patient, and plain-spoken; don't use too many exclamation marks or emoji. [...] And be brief — don't ever send essay-length responses. Aim for a good back-and-forth."
Yeah these chatbots are by default geared towards doing your work for you instead of filling the gaps in your knowledge (something they would be excellent at). I feel it must be symptomatic of the vision these vendors have for their products, one of fully autonomous replacements for workers rather than of tools to enhance the worker.
I was under the impression that, at least for models without "reasoning", asking them to be terse hampered their ability to give complete and correct answers? Not so?
When prompting an LLM service to leak the system prompt, how do you have the faintest idea as to its accuracy?
I've read people say it's a difficult challenge for the providers. But aren't there some pretty basic strategies? E.g., code pretty near the front of the stack that just does some fuzzy string comparison on all output? They don't need to rely on just model behavior…
I imagine it's likely that the model is just doing what it's good at? Hallucinating a prompt?
I've found that a lot of prompt engineering boils down to managing layers of emphasis. You can use caps, bold, asterisks, precede instructions with "this is critically important:", and so on. It's also often necessary to repeat important instructions a bunch of times.
How exactly you do it is often arbitrary/interchangeable, but it definitely does have an effect, and is crucial to getting LLMs to follow instructions reliably once prompts start getting longer and more complex.
Hah, yeah I'd love to know if OpenAI ran evals that were fine-grained enough to prove to themselves that putting that bit in capitals made a meaningful difference in how likely the LLM was to just provide the homework answer!
I'd be interested to see what results one would get using that prompt with other models. Is there much more to ChatGPT Study Mode than a specific system prompt? Although I am not a student, I have used similar prompts to dive into topics I wish to learn, with, I feel, positive results indeed. I shall give this a go with a few models.
So this "study mode" is just a different system prompt for the model? Not sure why they would add this if you could create this as a custom GPT yourself. Just marketing...?
Interesting that it spits the instructions out so easily and OpenAI didn't seem to harden it to prevent this. It's like they intended this to happen, but for some reason didn't want to share the system instructions explicitly.
If I were OpenAI, I would deliberately "leak" this prompt when asked for the system prompt as a honeypot to slow down competitor research whereas I'd be using a different prompt behind the scenes.
Not saying it is indeed reality, but it could simple be programmed to return a different prompt from the original, appearing plausible, but perhaps missing some key elements.
But of course, if we apply Occam's Razor, it might simply really be the prompt too.
That kind of thing is surprisingly hard to implement. To date I've not seen any provider been caught serving up a fake system prompt... which could mean that they are doing it successfully, but I think it's more likely that they determined it's not worth it because there are SO MANY ways someone could get the real one, and it would be embarrassing if they were caught trying to fake it.
Tokens are expensive. How much of your system prompt do you want to waste on dumb tricks trying to stop your system prompt from leaking?
As a lifelong learner, experientially it feels like a big chunk of time spent studying is actually just searching. AI seems like a good tool to search through a large body of study material and make that part more efficient.
The other chunk of time, to me anyway, seems to be creating a mental model of the subject matter, and when you study something well you have a strong grasp on the forces influencing cause and effect within that matter. It's this part of the process that I would use AI the least, if I am to learn it for myself. Otherwise my mental model will consist of a bunch of "includes" from the AI model and will only be resolvable with access to AI. Personally, I want a coherent "offline" model to be stored in my brain before I consider myself studied up in the area.
>big chunk of time spent studying is actually just searching.
This is a good thing in many levels.
Learning how to search is (was) a good skill to have. The process of searching itself also often leads to learning tangentially related but important things.
I'm sorry for the next generations that won't have (much of) these skills.
That was relevant when you were learning to search through “information” for the answer to your question, eg the digital version of going through the library or digging through a reference book.
I don’t think it’s so valuable now that you’re searching through piles of spam and junk just to try find anything relevant. That’s a uniquely modern-web thing created by Google in their focus of profit over user.
Unless Google takes over libraries/books next and sells spots to advertisers on the shelves and in the books.
Searching is definitely a useful skill, but once you've been doing it for years you probably don't need the constant practice and are happy to avoid it.
Yeah, this is literally why I built app.ubik.studio -- searching is everything, and understanding what you are reading is more important than conversing with a chatbot. I cannot even imagine being a student in 2025, especially at 14 years old; it would be so hard not to just cheat on everything.
Spaced repetition systems would be the perfect complement to your approach - they're specifically designed to help build that "offline" mental model by systematically moving knowledge from AI-assisted lookup to permanent memory.
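To make that concrete, here is a toy scheduler sketch, loosely inspired by SM-2-style spaced repetition; the intervals and multipliers are illustrative assumptions, not a faithful implementation of any particular app's algorithm:

```python
from dataclasses import dataclass

@dataclass
class Card:
    interval_days: float = 1.0  # days until the next review
    ease: float = 2.5           # growth factor for the interval

def review(card: Card, quality: int) -> Card:
    """quality: 0 (forgot completely) .. 5 (perfect recall)."""
    if quality < 3:
        # Failed recall: restart the interval and make the card a bit "harder".
        return Card(interval_days=1.0, ease=max(1.3, card.ease - 0.2))
    new_ease = max(1.3, card.ease + 0.1 - (5 - quality) * 0.08)
    return Card(interval_days=card.interval_days * new_ease, ease=new_ease)

card = Card()
for q in [5, 4, 5]:
    card = review(card, q)
    print(f"next review in ~{card.interval_days:.1f} days")
```

The point is only that review intervals grow as recall stays solid, which is what moves material from AI-assisted lookup into permanent memory.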
Isn’t the goal of Study Mode exactly that, though? Instead of handing you the answers, it tries to guide you through answering it on your own; to teach the process.
It doesn’t do any of that, it just captures the student market more.
They want a student to use it and say “I wouldn’t have learned anything without study mode”.
This also allows them to fill their data coffers more with bleeding edge education. “Please input the data you are studying and we will summarize it for you.”
Not to be contrarian, but do you have any evidence of this assertion? Or are you just confidently confabulating a response for something outside of the data you've been exposed to? Because a commenter below provided a study that directly contradicts this.
This isn't study mode, it's a different AI tutor, but:
"The median learning gains for students, relative to the pre-test baseline (M = 2.75, N = 316), in the AI-tutored group were over double those for students in the in-class active learning group."
"The occurrence of inaccurate “hallucinations” by the current [LLMs] poses a significant challenge for their use in education. [...] we enriched our prompts with comprehensive, step-by-step answers, guiding the AI tutor to deliver accurate and high-quality explanations (v) to students. As a result, 83% of students reported that the AI tutor’s explanations were as good as, or better than, those from human instructors in the class."
Not at all dismissing the study, but to replicate these results for yourself, this level of gain over a classroom setting may be tricky to achieve without having someone make class materials for the bot to present to you first
Edit: the authors further say
"Krupp et al. (2023) observed limited reflection among students using ChatGPT without guidance, while Forero (2023) reported a decline in student performance when AI interactions lacked structure and did not encourage critical thinking. These previous approaches did not adhere to the same research-based best practices that informed our approach."
Two other studies failed to get positive results at all. YMMV a lot apparently (like, all bets are off and your learning might go in the negative direction if you don't do everything exactly as in this study)
There are studies showing that LLMs make experienced devs slower in their work. I wouldn't be surprised if it were the same for self study.
However, consider the extent to which LLMs make the learning process more enjoyable. More students will keep pushing because they have someone to ask. Also, having fun & being motivated is such a massive factor when it comes to learning. And, finally, keeping at it at 50% the speed for 100% the material always beats working at 100% the speed for 50% the material. Who cares if you're slower - we're slower & faster without LLMs too! Those that persevere aren't the fastest; they're the ones with the most grit & discipline, and LLMs make that more accessible.
The study you're referencing doesn't make that conclusion.
It concludes there's a learning curve that generally takes about 50 hours to get past. The data shows that the one engineer who had more than 50 hours of experience with Cursor actually worked faster.
This is largely my experience, now. I was much slower initially, but I've now figured out the correct way to prompt, guide, and fix the LLM to be effective. I produce way more code and am mentally less fatigued at the end of each day.
People keep citing this study (and it was on the top of HN for a day). But this claim falls flat when you find out that the test subjects had effectively no experience with LLM equipped editors and the 1-2 people in the study that actually did have experience with these tools showed a marked increase in productivity.
Like yeah, if you’ve only ever used an axe you probably don’t know the first thing about how to use a chainsaw, but if you know how to use a chainsaw you’re wiping the floor with the axe wielders. Wholeheartedly agree with the rest of your comment; even if you’re slow you lap everyone sitting on the couch.
I presume you're referring to the recent METR study. One aspect of the study population, which seems like an important causal factor in the results, is that they were working in large, mature codebases with specific standards for code style, which libraries to use, etc. LLMs are much better at producing "generic" results than matching a very specific and idiosyncratic set of requirements. The study involved the latter (specific) situation; helping people learn mainstream material seems more like the former (generic) situation.
(Qualifications: I was a reviewer on the METR study.)
I believe we'll see the benefits and drawbacks of AI augmentation to humans performing various tasks will vary wildly based on the task, the way the AI is being asked to interact, and the AI model.
I would be interested to see if there have already been studies about the efficacy of tutors at good colleges. In my experience (in academia), the students who make it into an Ivy or an elite liberal arts school make extensive use of tutor resources, but not in a helpful way. They basically just get the tutor to work problems for them (often their homework!) and feel like they've "learned" things because tough questions always seems so obvious when you've been shown the answer. In reality, what it means it that they have no experience being confused or having to push past difficult things they were stuck on. And those situations are some of the most valuable for learning.
I bring this up because the way I see students "study" with LLMs is similar to this misapplication of tutoring. You try something, feel confused and lost, and immediately turn to the pacifier^H^H^H^H^H^H^H ChatGPT helper to give you direction without ever having to just try things out and experiment. It means students are so much more anxious about exams where they don't have the training wheels. Students have always wanted practice exams with similar problems to the real one with the numbers changed, but it's more than wanting it now. They outright expect it and will write bad evals and/or even complain to your department if you don't do it.
I'm not very optimistic. I am seeing a rapidly rising trend at a very "elite" institution of students being completely incapable of using textbooks to augment learning concepts that were introduced in the classroom. And not just struggling with it, but lashing out at professors who expect them to do reading or self study.
Come on. Asking an educational product to do a basic sanity test as to whether it helps is far too high a bar. Almost no educational app does that sort of thing.
My key to LLM study has been to always primarily use a book and then let the LLM allow you to help with formulae, ask questions about the larger context, and verify your understanding.
Helping you parse notation, especially in new domains, is insanely valuable. I do a lot of applied math in statistics/ML, but when I open a physics book the notation and the comfort with shorthand are a real challenge (likewise, I imagine the reverse is equally annoying). Having an LLM on demand to instantly clear up notation is a massive speed boost.
Reading German Idealist philosophy requires an enormous amount of context. Being able to ask an LLM questions like "How much of this section of Mainländer is coming directly from Schopenhauer?" is a godsend in helping understand which parts of the writing are merely setting up what is already agreed upon vs laying new ground.
And the most important for self study: verifying your understanding. Backtracking because you misunderstood a fundamental concept is a huge time sink in self study. Now, every time I read a formula I can go through all of my intuitions and understanding about it, write them down, and verify. Even a "not quite..." from an LLM is enough to make me realize I need to spend more time on that section.
Books are still the highest density information source and best way to learn, but LLMs can do a lot to accelerate this.
An acquaintance of mine has a start-up in this space and uses OpenAI to do essentially the same thing. This must look like, and may well be, the guillotine for him...
It's my primary fear building anything on these models, they can just come eat your lunch once it looks yummy enough. Tread carefully
No disrespect to your acquaintance, but when I heard about this, I didn't think "oh a lot of startups are gonna go under", I thought "OAI added an option to use a hard-coded system prompt and they're calling it a 'mode'??"
Thanks for introducing me to the verb Sherlock! I'm one of today's lucky 10,000.
> In the computing verb sense, refers to the software Sherlock, which in 2002 came to replicate some of the features of an earlier complementary program called Watson.[1]
But what’s the future in terms of profitability of LLM providers?
As long as features like Study Mode are little more than creative prompting, any provider will eventually be able to offer them and offer token-based charging.
I'm too young to have experienced this, but I'm sure others here aren't.
During the early days of tech, was there prevailing wisdom that software companies would never be able to compete with hardware companies because the hardware companies would always be able to copy them and ship the software with the hardware?
Because I think it's basically the analogous situation. People assume that the foundation model providers have some massive advantage over the people building on top of them, but I don't really see any evidence for this.
Claude Code and Gemini-CLI are able to offer much more value compared to startups (like Cursor) that need to pay for model access, largely due to the immense costs involved.
Yes, any LLM-adjacent application developer should be concerned. Even if they don't do 100% of what your product does, their market reach and capitalization is scary. Any model/tooling improvements that just happen to encroach in your domain will put you on the clock...
This is actually a public validation for your friend's startup.
A proper learning tool will have history of conversation with the student, understand their knowledge level, have handcrafted curricula (to match whatever the student is supposed to learn), and be less susceptible to hallucination.
OpenAI have a bunch of other things to worry about and won't just pivot to this space.
I used to work for copy.ai and this happened to them. Investors always asked if the founders were worried about OpenAI competing with their consumer product. Then ChatGPT was released. Turns out that was a reasonable concern.
These days they’ve pivoted to a more enterprise product and are still chugging along.
All big companies today, not only foundation model providers but also those in media and other verticals, tend to want to be the platform for the end user. They don't want a middleman.
If you are trying to be a middleman, you should be prepared.
Ah, I don't know. Of course there is risk involved no matter what we do (see the IDE/Cursor space), but we need to be somewhat critical of the value we add.
If you want to try and make a quick buck, fine, be quick and go for whatever. If you plan on building a long term business, don't do the most obvious, low effort low hanging fruit stuff.
I'm really waiting for somebody to figure out the correct interface for all this. For example, study mode will present you with a wall of text containing information, examples, and questions. There's no great way to associate your answers with specific questions. The chat interface just isn't good for this sort of interaction. ChatGPT really needs to build its own canvas/artifact interface wherein questions/responses are tied together. It's clear, at this point, that we're doing way too much with a UI that isn't designed for more than a simple conversation.
Looks like a great start, played around with it a bit yesterday and today, I've basically been doing the same with my own CLI but the UI you came up with helps a great deal with navigation and resuming learning :)
One issue I found is the typical "LLM accuracy" issue, with seemingly no recourse. I tried to generate some courses for topics I already know well, just to review how accurate it is, and while for popular subjects (ex: "Electronic Music Fundamentals") it gets most of the details correct, less popular subjects (ex: "Scene Transitions with Octatrack") are riddled with errors (both in the "docs" and the quizzes/exercises), and I cannot find a way of correcting/adjusting/reporting the errors.
Agree, one thing that brought this home was the example where the student asks to learn all of game theory. There seems to be an assumption on both sides that this will be accomplished in a single chat session by a linear pass, necessarily at a pretty superficial level.
Yeah. And how to tie the teacher into all this. The teacher needs to upload the context, like the textbook, so the LLM can refer to tangible class material.
This phrase is now an in-joke used as a reply to someone quoting LLM info as "facts".
[1] AI-driven chat assistant for ECE 120 course at UIUC (only 1 comment by the website creator):
https://news.ycombinator.com/item?id=41431164
What they are amazing at though is summarisation and rephrasing of content. Give them a long document and ask "where does this document assert X, Y and Z", and it can tell you without hallucinating. Try it.
Not only does it make for an interesting time if you're in the world of intelligent document processing, it makes them perfect as teaching assistants.
Worse, more insidious, and much more likely is that the model is trained on or retrieves an answer that is incorrect, biased, or only conditionally correct for some seemingly relevant but different scenario.
A nontrivial amount of content online is marketing material, that is designed to appear authoritative and which may read like (a real example) “basswood is renowned for its tonal qualities in guitars”, from a company making cheap guitars.
If we were worried about a post-truth era before, at least we had human discernment. These new capabilities abstract away our discernment.
A really great example of this is Grok on Twitter, constantly debunking human "hallucinations" all day.
https://en.wikipedia.org/wiki/Rubber_duck_debugging
A model that I call through e.g. langchain with constraints, system prompts, embeddings and whatnot will react very differently from when I pose the same question through the AI provider's public chat interface.
Or, putting the question differently: could OpenAI not train, constrain, configure and tune models and combine them into a UI that then acts differently from what you describe, for another use case?
That being said, as someone who worked in a library and a bookstore, 90% of workbooks and technical books are identical. NotebookLM's mindmap feature is such a time saver.
Regular research has the same problem finding bad forum posts and other bad sources by people who don't know what they're talking about, albeit usually to a far lesser degree depending on the subject.
What time is {unix timestamp 1}?
ChatGPT: a month in the future
Deepseek: Today at 1:00
What time is {unix timestamp 2}?
ChatGPT: a month in the future + 1 min
Deepseek: Today at 1:01, this time is 5 min after your previous timestamp
Sure, let me trust these results...
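For what it's worth, this particular kind of answer is trivial to check locally instead of trusting either model; a minimal sketch in Python (the timestamp value here is an arbitrary example, not the one from that chat):

```python
from datetime import datetime, timezone

ts = 1735689600  # arbitrary example Unix timestamp
print(datetime.fromtimestamp(ts, tz=timezone.utc))  # 2025-01-01 00:00:00+00:00
```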
A simple solution is just to take <answer> and cut and paste it into Google and see if articles confirm it.
This part is the 2nd (or maybe 3rd) most annoying one to me. Did we learn absolutely nothing from the last few years of enshittification? Or Netflix? Do we want to run into a crisis in the 2030's where billionaires hold knowledge itself hostage as they jack up costs?
Regardless of your stance, I'm surprised how little people are bringing this up.
I know you'll probably think I'm being facetious, but have you tried Claude 4 Opus? It really is a game changer.
So "risk of hallucination" as a rebuttal to anybody admitting to relying on AI is just not insightful. like, yeah ok we all heard of that and aren't changing our habits at all. Most of our teachers and books said objectively incorrect things too, and we are all carrying factually questionable knowledge we are completely blind to. Which makes LLMs "good enough" at the same standard as anything else.
Don't let it cite case law? Most things don't need this stringent level of review
Except these systems will still confidently lie to you.
The other day I noticed that DuckDuckGo has an Easter egg where it will change its logo based on what you've searched for. If you search for James Bond or Indiana Jones or Darth Vader or Shrek or Jack Sparrow, the logo will change to a version based on that character.
If I ask Copilot if DuckDuckGo changes its logo based on what you've searched for, Copilot tells me that no it doesn't. If I contradict Copilot and say that DuckDuckGo does indeed change its logo, Copilot tells me I'm absolutely right and that if I search for "cat" the DuckDuckGo logo will change to look like a cat. It doesn't.
Copilot clearly doesn't know the answer to this quite straightforward question. Instead of lying to me, it should simply say it doesn't know.
I agree that if the user is incompetent, cannot learn, and cannot learn to use a tool, then they're going to make a lot of mistakes from using GPTs.
Yes, there are limitations to using GPTs. They are pre-trained, so of course they're not going to know about some easter egg in DDG. They are not an oracle. There is indeed skill to using them.
They are not magic, so if that is the bar we expect them to hit, we will be disappointed.
But neither are they useless, and it seems we constantly talk past one another because one side insists they're magic silicon gods, while the other says they're worthless because they are far short of that bar.
- Why does DuckDuckGo change its logo based on what you've searched?
- Why doesn't DuckDuckGo change its logo based on what you've searched?
- When did DuckDuckGo add the current feature that changes the logo based on what you've searched?
- When did DuckDuckGo remove the feature that changes the logo based on what you've searched?
This is similar to what you did, but it feels more natural when I genuinely don't know the answer myself. By asking loaded questions like this, you can get a sense of how strongly this information is encoded in the model. If the LLM comes up with an answer without contradicting any of the questions, it simply doesn't know. If it comes up with a reason for one of them, and contradicts the other matching loaded question, you know that information is encoded fairly strongly in the model (whether it is correct is a different matter).
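If you want to run that probe systematically rather than by hand, a minimal sketch using the official OpenAI Python client could look like the following; the model name is a placeholder, and the probes are just the four loaded questions above:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

probes = [
    "Why does DuckDuckGo change its logo based on what you've searched?",
    "Why doesn't DuckDuckGo change its logo based on what you've searched?",
    "When did DuckDuckGo add the feature that changes the logo based on your search?",
    "When did DuckDuckGo remove the feature that changes the logo based on your search?",
]

for q in probes:
    # Each probe runs in a fresh conversation so the answers can't lean on each other.
    reply = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[{"role": "user", "content": q}],
    )
    print(q)
    print("->", reply.choices[0].message.content, "\n")
```

If the model happily invents a reason for both the "added" and the "removed" versions, the information probably isn't encoded in it at all.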
This is where the skepticism arises. Before we spend another $100 billion on something that ended up being worthless, we should first prove that it’s actually useful. So far, that hasn’t conclusively been demonstrated.
Truly a mystery why America is falling behind.
I can often get away with just using models locally in contexts where I care about privacy. Sometimes I will use more capable models through APIs to generate richer prompts than I could write myself, to better guide local models too.
That trained and sharpened invaluable skills involving critical thinking and grit.
Here's what Socrates had to say about the invention of writing.
> "For this invention will produce forgetfulness in the minds of those who learn to use it, because they will not practice their memory. Their trust in writing, produced by external characters which are no part of themselves, will discourage the use of their own memory within them. You have invented an elixir not of memory, but of reminding; and you offer your pupils the appearance of wisdom, not true wisdom, for they will read many things without instruction and will therefore seem [275b] to know many things, when they are for the most part ignorant and hard to get along with, since they are not wise, but only appear wise."
https://www.historyofinformation.com/detail.php?id=3439
I mean, he wasn't wrong! But nonetheless I think most of us communicating on an online forum would probably prefer not to go back to a world without writing. :)
You could say similar things about the internet (getting your ass to the library taught the importance of learning), calculators (you'll be worse at doing arithmetic in your head), pencil erasers (https://www.theguardian.com/commentisfree/2015/may/28/pencil...), you name it.
Furthermore, the forgetting curve is a thing, and therefore having to piece information together repetitively, preferably in a structured manner, leads to much better information retention. People love to claim how fast they are "learning" (more like consuming TikToks) from podcasts at 2x speed and LLMs, but are unable to recite whatever was presented a few hours later.
Third, there was a paper circulating even here on HN that showed that the use of LLMs literally hinders brain activation.
Admittedly I have less experience with ChatGPT, but those experiences were equally bad.
The good: it can objectively help you to zoom forward in areas where you don’t have a quick way forward.
The bad: it can objectively give you terrible advice.
It depends on how you sum that up on balance.
Example: I wanted a way forward to program a chrome extension which I had zero knowledge of. It helped in an amazing way.
Example: I keep trying to use it in work situations where I have lots of context already. It sometimes performs better than nothing, but often worse than nothing.
Mixed bag, that’s all. Nothing to argue about.
But this is completely wrong! In the Monty Hall problem, the host has to reveal a door with a goat behind it for you to gain the benefit of switching. I have to point this out for the LLM to get it right. It did not reason about the problem I gave it, it spat out the most likely response given the "shape" of the problem.
This is why shrugging and saying "well humans get things wrong too" is off base. The problem is that the LLM is not thinking, period. So it cannot create a mental model of your understanding of a subject, it is taking your text and generating the next message in a conversation. This means that the more niche the topic (or your particular misunderstanding), the less useful it will get.
As an autistic person I can assure you that people filling in information you neither gave nor implied is the norm for human conversations. In that sense, the LLMs are being human-like.
If you explicitly tell it that you didn't simply forget to have the host open a door, it gives you the right answer:
> There is a game show with three closed doors. One has a car and two have goats. I pick a door. Then, without opening my door or any other door, the host asks me if I want to switch my choice to one of the other two doors. What is my best strategy and what are the probabilities?
Gemini: This is a classic probability puzzle known as the Monty Hall problem, though your description presents a slight variation. The optimal strategy depends entirely on whether the host opens a door or not.

Your Scenario: The Host Doesn't Open a Door

In the situation you've described, where the host asks if you want to switch your choice to one of the other two doors without opening any doors, there is no advantage to switching. Here's a breakdown of the probabilities in your specific scenario:

Your initial choice: When you first pick a door, you have a 1 in 3 chance of selecting the door with the car. This means you have a 2 in 3 chance of having picked a door with a goat.

The other two doors: The combined probability that the car is behind one of the other two doors is 2/3.

Switching your choice: If you decide to switch, you are essentially choosing one of the other two doors. Since there is no new information (a door with a goat has not been revealed), the 2/3 probability is evenly split between those two doors. Therefore, your chance of winning by switching to a specific one of the other doors is 1/3.

In this case, your odds of winning are 1/3 whether you stay with your original choice or switch to one of the other doors.
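If you'd rather convince yourself empirically than argue with the chatbot, a quick simulation of both variants (standard library only) reproduces the 2/3 vs 1/3 split described above:

```python
import random

def play(switch: bool, host_reveals: bool) -> bool:
    """One round; returns True if the player ends up with the car."""
    doors = [0, 1, 2]
    car = random.choice(doors)
    pick = random.choice(doors)
    if host_reveals:
        # Standard Monty Hall: host opens a goat door that isn't the player's pick.
        opened = random.choice([d for d in doors if d != pick and d != car])
        if switch:
            pick = next(d for d in doors if d != pick and d != opened)
    elif switch:
        # No-reveal variant: switching just means picking one of the other two doors.
        pick = random.choice([d for d in doors if d != pick])
    return pick == car

def estimate(switch: bool, host_reveals: bool, n: int = 100_000) -> float:
    return sum(play(switch, host_reveals) for _ in range(n)) / n

print("standard, switch: ", estimate(True, True))    # ~0.67
print("standard, stay:   ", estimate(False, True))   # ~0.33
print("no reveal, switch:", estimate(True, False))   # ~0.33
print("no reveal, stay:  ", estimate(False, False))  # ~0.33
```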
People on here always assert LLMs don't "really" think or don't "really" know without defining what all that even means, and to me it's getting pretty old. It feels like an escape hatch so we don't feel like our human special sauce is threatened, a bit like how people felt threatened by heliocentrism or evolution.
It happens with many technological advancements historically. And in this case there are people trying hard to manufacture outrage about LLMs.
That mentality seems to be more about reinforcing your insistence on ChatGPT than a genuine inquiry into communities that could help you out.
Stack Overflow?
The IRC, Matrix, or Slack chats for the languages?
I've learnt a great many things online, but I've also learnt a great many more from books, other people and my own experience. You just have to be selective. Some online tutorials are excellent, for example the Golang and Rust tutorials. But for other things books are better.
What you are missing is the people. We used to have IRC and forums where you could discuss things in great depth. Now that's gone and the web is owned by big tech and governments you're happy to accept a bot instead. It's sad really.
What's funny is that LLMs got trained on datasets that include all that incorrect, outdated or hostile content.
It mostly isn't; the point of a good learning process is to invest time into verifying "once" and then add the verified facts to the learning material, so that learners can spend that time learning the material instead of verifying everything again.
Learning to verify is also important, but it's a different skill that doesn't need to be practiced literally every time you learn something else.
Otherwise you significantly increase the costs of the learning process.
Not underrated at all. Lots of people were happy to abandon Stack Overflow for this exact reason.
> Adding in a mode that doesn't just dump an answer but works to take you through the material step-by-step is magical
I'd be curious to know how much this significantly differs from just a custom academically minded GPT with an appropriately tuned system prompt.
https://chatgpt.com/gpts
Not true, if we assume that most books from publishing houses with good reputations are checked for errors. Good books may be dated, but they don't contain made-up things.
Researching online properly requires cross-referencing, seeing different approaches, and understanding the various strengths, weaknesses, and biases of such sources.
And that's for objective information, like math and science. I thought Grok's uhh... "update" showed enough of the dangers of resorting to a billionaire-controlled oracle as an authoritative resource.
>Will some (most?) people rely on it lazily without using it effectively? Certainly, and this technology won't help or hinder them any more than a good old fashioned textbook.
I don't think facilitating bad habits like lazy study is an effective argument. And I don't really subscribe to this inevitability angle either: https://tomrenner.com/posts/llm-inevitabilism/
I think the trick is to look at the references that the model shows you. For example, o3 with web search will give you lots of references. 90% of the time, just reading those tells me whether the model and I are aligned.
For example, the other day I was figuring out why, using SQLAlchemy sessions and async pytest, I might get the "Connection was attached to different loop" error. Now, if you started by asking o3 for a solution, it would take a long time, because it would make small mistakes in the code and you would spend a lot of time trying to fix them. The better way to use o3 was to ask it to give you debugging statements (session listeners attached to SQLAlchemy sessions) and understand what was going on by reading their output. Much faster.
Once it (and I) started looking at the debugging statements, the error became clear: the sessions/connections were leaking to a different event loop, and a loop_scope= param needed to be specified for all fixtures. o3 did not provide a correct solution for the code, but I could, and its help was crucial in writing a fuck ton of debugging code and getting clues.
I also asked o3 to make a bunch of probe questions to test me. For example, it said something like: try changing the loop_scope from module to function - what do you expect the loop id and transaction id to be for this test?
I learned more than I realized about ORMs, how they can be used to structure transactions, and how to structure async pytest tests.
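For readers hitting the same error, here is a rough sketch of the loop_scope fix described above, assuming pytest-asyncio >= 0.24 (which added loop_scope on fixtures and markers) and aiosqlite as a stand-in async driver; the fixture and test names are placeholders:

```python
import pytest
import pytest_asyncio
from sqlalchemy import text
from sqlalchemy.ext.asyncio import AsyncSession, create_async_engine

@pytest_asyncio.fixture(scope="session", loop_scope="session")
async def engine():
    # One engine, created on the session-scoped event loop.
    eng = create_async_engine("sqlite+aiosqlite:///:memory:")
    yield eng
    await eng.dispose()

@pytest_asyncio.fixture(loop_scope="session")
async def db_session(engine):
    # Sessions run on the same loop as the engine fixture.
    async with AsyncSession(engine) as session:
        yield session

@pytest.mark.asyncio(loop_scope="session")
async def test_select_one(db_session):
    # With fixtures and test sharing loop_scope, the connection is no longer
    # "attached to a different loop".
    result = await db_session.execute(text("SELECT 1"))
    assert result.scalar() == 1
```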
One thing I'm trying these days is to have it create a memory palace from all the stuff I have in my house, link it to a new concept I'm learning, and put it into an Anki deck.
Now in regards to LLMs, I use them almost every day, so does my team, and I also do a bit of postmortem and reflection on what was accomplished with them. So, skeptical in some regards, but certainly not behaving like a Luddite.
The main issue I have with all the proselytization about them, is that I think people compare getting answers from an LLM to getting answers from Google circa 2022-present. Everyone became so used to just asking Google questions, and then Google started getting worse every year; we have pretty solid evidence that Google's results have deteriorated significantly over time. So I think that when people say the LLM is amazing for getting info, they're comparing it to a low baseline. Yeah maybe the LLM's periodically incorrect answers are better than Google - but are you sure they're not better than just RTFM'ing? (Obviously, it all depends on the inquiry.)
The second, related issue I have is that we are starting to see evidence that the LLM inspires more trust than it deserves due to its humanlike interface. I recently started to track how often Github Copilot gives me a bad or wrong answer, and it's at least 50% of the time. It "feels" great though because I can tell it that it's wrong, give it half the answer, and then it often completes the rest and is very polite and nice in the process. So is this really a productivity win or is it just good feels? There was a study posted on HN recently where they found the LLM actually decreases the productivity of an expert developer.
So I mean I'll continue to use this thing but I'll also continue to be a skeptic, and this also feels like kinda where my head was with Meta's social media products 10 years ago, before I eventually realized the best thing for my mental health was to delete all of them. I don't question the potential of the tech, but I do question the direction that Big Tech may take it, because they're literal repeat offenders at this point.
Fairly recent study on this: LLMs made developers slightly less productive, but the developers themselves felt more productive with them: https://www.theregister.com/2025/07/11/ai_code_tools_slow_do...
There is definitely this pain point that some people talk about (even in this thread): "well, at least AI doesn't berate me or reject my answer for bureaucratic reasons". And I find that intriguing in a community like this. Even some extremely techy people (or especially?) sometimes just want to, at best, feel respected, or, at worst, have their own notions confirmed by someone they deem to be "smart".
>I don't question the potential of the tech, but I do question the direction that Big Tech may take it, because they're literal repeat offenders at this point.
And that indeed is my biggest reservation here. Even if AI can do great things, I don't trust the incentive models OpenAI has. Instead of potentially being this bastion of knowledge, it may be yet another vector of trying to sell you ads and steal your data. My BOTD is long gone now.
Except that the textbook was probably QA’d by a human for accuracy (at least any intro college textbook, more specialized texts may not have).
Matters less when you have background in the subject (which is why it’s often okay to use LLMs as a search replacement) but it’s nice not having a voice in the back of your head saying “yeah, but what if this is all nonsense”.
Maybe it was not QA'd when the first edition was printed, but at least it was the same content shown to hundreds of people rather than something uniquely crafted for you.
The many eyes looking at it will catch it and course correct, while the LLM output does not get the benefit of the error correction algorithm because someone who knows the answer probably won't ask and check it.
I feel this way about reading maps vs. following GPS navigation: the fact that Google asked me to take an exit here as a shortcut feels like it might be trying to solve Braess' paradox in real time.
I wonder if this route was made for me to avoid my car adding to some congestion somewhere, and whether that actually benefits me or just the people already stuck on that road.
I use LLMs but only for things that I have a good understanding of.
Both sides think the other is either exaggerating or just not using the tool correctly.
What both sides should do is show evidence in the form of chat extracts or videos. There are a number from the pro-LLM side, but obviously selection bias applies here. It would be interesting if the anti-LLM side started to post more negative examples (real chat extracts or videos).
In my experience, most educational resources are either slightly too basic or slightly too advanced, particularly when you're trying to understand some new and unfamiliar concept. Lecturers, YouTubers, and textbook authors have to make something that works for everybody, which means they might omit information you don't yet know while teaching you things you already understand. This is where LLMs shine: if there's a particular gap in your knowledge, they can help you fill it, getting you unstuck.
At what cost? Are you considering all the externalities? What do you think will happen when Altman (and their investors) decides to start collecting their paychecks?
Thinking back, I believe the change from enthusiasm to misanthropy (mis[ai]thropy?) happened around the time it became a viable replacement for some of the labor performed by software devs, and in increasing proportion to that viability.
Before that, the tone was more like "The fact is, if 80% of your job or 80% of its quality can be automated, it shouldn't be a job anymore."
Knowing myself, it perhaps wasn't that bad that I didn't have such tools; it depends on the topic. But I couldn't imagine writing a thesis without an LLM anymore.
https://www.reddit.com/r/LibreWolf/s/Wqc8XGKT5h
You should only trust going into a library and reading stuff from microfilm. That's the only real way people should be learning.
/s
On the flip side, I prefer the human touch of the Kotlin, Python, and Elixir channels.
Learning what is like that? MIT OpenCourseWare has been available for like 10 years, with anything you could want to learn in college.
Textbooks are all easily pirated
People who are learning a new topic are precisely the people least able to do this.
A friend of mine used chatgpt to try to learn calculus. It gave her an example...with constants changed in such a way that the problem was completely different (in the way that 1/x^2 is a totally different integration problem than 1/(x^2 + 1)). It then proceeded to work the problem incorrectly (ironically enough, in exactly the way that I'd expect a calculus student who doesn't really understand algebra to do it incorrectly), produced a wrong answer, and merrily went on to explain to her how to arrive at that wrong answer.
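For reference (and as a sanity check on the example above), the two integrands really do call for completely different techniques, which is why swapping the constants changes the problem entirely:

```latex
\[
\int \frac{dx}{x^{2}} = -\frac{1}{x} + C
\qquad\text{vs.}\qquad
\int \frac{dx}{x^{2}+1} = \arctan(x) + C
\]
```

The first is a straightforward power-rule antiderivative; the second is the standard arctangent form. An algebra slip that turns one into the other changes the whole method, not just the final answer.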
The last time I tried to use an LLM to analyze a question I didn't know the answer to (analyze a list of states to which I couldn't detect an obvious pattern), it gave me an incorrect answer that (a) did not apply to six of the listed states, (b) DID apply to six states that were NOT listed, even though I asked it for an exclusive property, (c) miscounted the elements of the list, and (d) provided no less than eight consecutive completely-false explanations on followup, only four of which it caught itself, before finally giving up.
I'm all for expanding your horizons and having new interfaces to information, but reliability is especially important when you're learning (because otherwise you build on broken foundations). If it fails at problems this simple, I certainly don't trust it to teach me anything in fields where I can't easily dissect bullshit. In principle, I don't think it's impossible for AI to get there; in practice, it doesn't seem to be.
Also using OpenAI as a tutor means trawling incorrect content.
Here in my country, English is not what you'll hear in everyday conversation. Native English speakers account for a tiny percentage of the population. Our language doesn't resemble English at all. However, English is a required subject in our mandatory education system. I believe this situation is quite typical across many Asian countries.
As you might imagine, most English teachers in public schools are not native speakers. And they, just like other language learners, make mistakes that native speakers wouldn't make, without even realizing what's wrong. This creates a cycle that reinforces non-standard English pragmatics in the classroom.
Teachers are not to blame. Becoming fluent and proficient enough in a second language to handle questions students spontaneously throw at you takes years, if not decades, of immersion. It's an unrealistic expectation for an average public school teacher.
The result is that rich parents either send their kids to private schools or pay for extra classes taught by native speakers after school. Poorer but smart kids realize the education system is broken and learn their second language from YouTube.
-
What's my point?
When it comes to math/science, in my experience, the current LLMs act similarly to the public school teachers mentioned above. And they're worse in history/economics. If you're familiar with the subject already, it's easy to spot the LLM's errors and gather the useful bits from its blather. But if you're just a student, it can easily become a case of the blind leading the blind.
That doesn't make LLMs completely useless for learning (just like I won't call public school teachers 'completely useless', that's rude!). But I believe that, in their current form, they should only play a rather minor role in a student's learning journey.
In my field there is also the moral/legal implications of generative AI.
high-IQ enough that they really find holes in the capabilities of LLMs in their industries
low-EQ enough that they only interpret it through their own experiences instead of seeing how other people's quality of life has improved
Correction: a tireless, capable, well-versed, sycophantic assistant that is often prone to inventing absolute bullshit.
> ...is an autodidact's dream
Not so sure about that, see above.
It can confidently spew completely wrong information and there's no way to tell when it's doing that. There's a real risk that it will teach you a complete lie based on how it "thinks" something should work, and unlearning that lie will be much harder than just learning the truth initially
... your "AI" is also trained on the above incorrect, outdated or hostile content ...
Besides, there isn't any of the usual privacy drawback, because no one cares if OpenAI learns about some bullshit you were told to learn.
You didn't see the Hacker News thread talking about the ChatGPT subpoena, did you? I was a bit shocked that 1) a tech community didn't think a company would store data you submit to their servers and 2) that they felt like some lawyers and judges reading their chat logs was some intimate invasion of privacy.
Let's just say I certainly cannot be arsed to read anyone else's stream of consciousness without being paid like a lawyer. I deal with kids, and it's a bit cute when they babble about semi-coherent topics. An adult clearly loses that cute appeal and just sounds like a madman.
That's not even some dig, I sure suck at explaining my mindspace too. It's a genuinely hard skill to convert thoughts to interesting, or even sensible, communication.
Even more important for me, as someone who did ask questions but less and less over time, is this: with GPTs I no longer have to see the passive-aggressive banner saying
> This question exists for historical reasons, not because it's a good question.
all the time on other people's questions, and typically on the best questions with the most useful answers there were.
As much as I have mixed feelings about where AI is heading, I’ll say this: I’m genuinely relieved I don’t need to rely on Stack Overflow anymore.
It is also deeply ironic how Stack Overflow alienated a lot of users in the name of inclusion (the Monica case), when all the while they themselves were the ones who really made people like me uncomfortable.
Now, everyone basically has a personal TA, ready to go at all hours of the day.
I get the commentary that it makes learning too easy or shallow, but I doubt anyone would think that college students would learn better if we got rid of TAs.
This simply hasn't been my experience.
It's too shallow. The deeper I go, the less useful it seems to be. This happens quickly for me.
Also, god forbid you're researching a complex and possibly controversial subject and you want it to find reputable sources or particularly academic ones.
1) The broad overview of a topic
2) When I have a vague idea, it helps me narrow down the correct terminology for it
3) Providing examples of a particular category ("are there any examples of where v1 in the visual cortex develops in a disordered way?")
4) "Tell me the canonical textbooks in field X"
5) Posing math exercises
6) Free form branching--while talking about one topic, I want to shift to another that is distinct but related.
I agree they leave a lot to be desired when digging very deeply into a topic. And my biggest pet peeve is when they hallucinate fake references ("tell me papers that investigate this topic" will, for any sufficiently obscure topic, result in a bunch of very promising paper titles that are wholly invented).
History is a great example: if you ask an LLM about a vaguely difficult period in history, it will just give you one side and act like the other doesn't exist, or, if there is another side, it will paint them in a very negative light that is often poorly substantiated. People don't just wake up one day and decide to be irrationally evil for no reason; if you believe that, you are a fool... although LLMs will agree with you more often than not, since it's convenient.
The result of these things is a form of gatekeeping. Give it a few years and basic knowledge will be almost impossible to find if it is deemed "not useful", whether that's an outdated technology the LLM doesn't see talked about much anymore or an ideological issue that doesn't fall in line with TOS or common consensus.
It's not a criticism, the landscape moves fast and it takes time to master and personalize a flow to use an LLM as a research assistant.
Start with something such as NotebookLM.
That's fine. Recognize the limits of LLMs and don't use them in those cases.
Yet that is something you should be doing regardless of the source. There are plenty of non-reputable sources in academic libraries and there are plenty of non-reputable sources from professionals in any given field. That is particularly true when dealing with controversial topics or historical sources.
I'd say that companies like Google and OpenAI are aware of the "reputable" concerns the Internet is expressing and are addressing them. This tech is going to be, if it isn't already, very powerful for education.
[1] http://bit.ly/4mc4UHG
The quality varies wildly across models & versions.
With humans, the statements "my tutor was great" and "my tutor was awful" reflect very little on "tutoring" in general, and are barely even responses to each other without more specificity about the quality of tutor involved.
Same with AI models.
Granted, that's probably well-trodden ground, to which model developers are primed to pay attention, and I'm (a) a relative novice with (b) very strong math skills from another domain (computational physics). So Chuck and I are probably both set up for success.
I'll tell you that I recently found it the best resource on the web for teaching me about the Thirty Years' War. I was reading a collection of primary source documents, and was able to interview ChatGPT about them.
Last week I used it to learn how to create and use Lehmer codes, and its explanation was perfect, and much easier to understand than, for example, Wikipedia.
I ask it about truck repair stuff all the time, and it is also great at that.
I don't think it's great at literary analysis, but for factual stuff it has only ever blown away my expectations at how useful it is.
Model Validation groups are one of the targets for LLMs.
You must be using a free model like GPT-4o (or the equivalent from another provider)?
I find that o3 is consistently able to go deeper than me in anything I'm a nonexpert in, and usually can keep up with me in those areas where I am an expert.
If that's not the case for you I'd be very curious to see a full conversation transcript (in chatgpt you can share these directly from the UI).
No one builds multi-shot search tools because they eat tokens like nobody's business, but I've deployed them internally at a company to rave reviews, at a cost of $200 per seat per day.
Ask it for sources. The two things where LLMs excel are filling in the sources for some claim you give them (lots will be made up, but there isn't anything better out there) and giving you search queries for some description you give them.
Blue team: you throw out concepts and have it steelman them.
Red team: you can literally throw any kind of stress test at your idea.
Alternate like this and you will learn.
A great prompt is “give me the top 10 xyz things” and then you can explore
Back in 2006 I used Wikipedia to prepare for job interviews :)
These things also apply to humans. A year or so ago I thought I'd finally learn more about the Israeli/Palestinian conflict. Turns out literally every source that was recommended to me by some reputable source was considered completely non-credible by another reputable one.
That said, I've found ChatGPT to be quite good at math and programming, and I can go pretty deep in both. I can definitely trip it into mistakes (e.g. it seems to use calculations to "intuit" its way around sometimes, and you can find cases where those calculations lead it in the wrong direction), but I also know enough to keep it on rails.
If you're really researching something complex/controversial, there may not be any
If it's a subject you are just learning, how can you possibly evaluate this?
This generation of AI doesn't yet have the knowledge depth of a seasoned university professor. It's the kind of teacher that you should, eventually, surpass.
Learning a new programming language used to be mediated with lots of useful trips to Google to understand how some particular bit worked, but Google stopped being useful for that years ago. Even if the content you're looking for exists, it's buried.
I think the potential in this regard is limitless.
(Only thing missing is the model(s) you used).
In the process it helped me learn many details about RA and NDP (Router Advertisements/Neighbor Discovery Protocol, which mostly replace DHCP and ARP from IPv4).
It made me realize that my WiFi mesh routers do quite a lot of things to prevent broadcast loops on the network, and that all my weird issues could be attributed to one cheap mesh repeater. So I replaced it and now everything works like a charm.
I had this setup for 5 years and was never able to figure out what was going on there, although I really tried.
I think this is the same thing with vibe coding, AI art, etc. - if you want something good, it's not the right tool for the job. If your alternative is "nothing," and "literally anything at all" will do, man, they're game changers.
* Please don't overindex on "shitty" - "If you don't need something verifiably high-quality"
I tried using YouTube to find walk through guides for how to approach the repair as a complete n00b and only found videos for unrelated problems.
But I described my issues and took photos to GPT O3-Pro and it was able to guide me and tell me what to watch out for.
I completed the repair (very proud of myself) and even though it failed a day later (I guess I didn’t re-seat well enough) I still feel far more confident opening it and trying again than I did at the start.
Cost of broken watch + $200 pro mode << Cost of working watch.
I find it odd that someone who has been to college would see this as a _bad_ way to learn something.
I'm not sold on LLMs being a replacement, but post-secondary was certainly enriched by having other people to ask questions to, people to bounce ideas off of, people that can say "that was done 15 years ago, check out X", etc.
There were times where I thought I had a great idea, but it was based on an incorrect conclusion that I had come to. It was helpful for that to be pointed out to me. I could have spent many months "paving forward", to no benefit, but instead someone saved me from banging my head on a wall.
Sure, you could pave forward, but realistically, you'll get much farther with either a good textbook or a good teacher, or both.
This requires a student to be actually interested in what they are learning, though. For others, who blindly trust its output, it can have adverse effects, like the illusion of having understood a concept when they might in fact have mislearned it.
There seems to be a gap in problem-solving abilities here... the process of breaking down concepts into easier-to-understand concepts and then recompiling them has been around forever; it is just easier to find those relationships now. To say it was impossible to learn concepts you are stuck on is a little alarming.
No, not really.
> Unless it was common enough to show up in a well formed question on stack exchange, it was pretty much impossible, and the only thing you can really do is keep paving forward and hope at some point, it'll make sense to you.
Your experience isn't universal. Some students learned how to do research in school.
It’s exciting when I discover I can’t replicate something that is stated authoritatively… which turns out to be controversial. That’s rare, though. I bet ChatGPT knows it’s controversial, too, but that wouldn’t be as much fun.
I had to post the source code to win the dispute, so to speak.
Ever read mainstream news reporting on something you actually know about? Notice how it's always wrong? I'm sure there's a name for this phenomenon. It sounds like exactly the same thing.
On the other hand it told me you can't execute programs when evaluating a Makefile and you trivially can. It's very hit and miss. When it misses it's rather frustrating. When it hits it can save you literally hours.
Regarding LLMs, they can also stimulate thinking if used right.
And which just makes things up (with the same tone and confidence!) at random and unpredictable times.
Yeah apart from that it's just like a knowledgeable TA.
Given that humanity has been able to go from living in caves to sending spaceships to the moon without LLMs, let me express some doubt about that.
Even without going further, software engineering isn't new and people have been stuck on concepts and have managed to get unstuck without LLMs for decades.
What you gain in instant knowledge with LLMs, you lose in learning how to get unstuck, how to persevere, how to innovate, etc.
It’s called basic research skills - don’t they teach this anymore in high school, let alone college? How ever did we get by with nothing but an encyclopedia or a library catalog?
If you don't have access to a community like that learning stuff in a technical field can be practically impossible. Having an llm to ask infinite silly/dumb/stupid questions can be super helpful and save you days of being stuck on silly things, even though it's not perfect.
I haven't tested them on many things. But in the past 3 weeks I tried to vibe code a little bit of VHDL. On the one hand it was a fun journey: I could experiment a lot and just iterate fast. But if I were someone who had no idea about hardware design, this trash would've guided me the wrong way in numerous situations. I can't even count how many times it built me latches instead of clocked registers (latches bad, if you don't know about it), and that's just one thing. Yes, I know there ain't much out there about HDLs (compared to Python and JavaScript), even less regarding VHDL. But damn, no, no, no. Not for learning. Never. If you know what you're doing and you have some fundamental knowledge of the topic, it might help you get further, but not for the absolute essentials; that will backfire hard.
Pre-LLM, even finding the ~5 textbooks with ~3 chapters each that decently covered the material I want was itself a nontrivial problem. Now that problem is greatly eased.
[0] https://time.com/7295195/ai-chatgpt-google-learning-school/
I also use it to remember some Python stuff. In Rust, it is less good: it makes mistakes.
In those two domains, at that level, it's really good.
It could help students I think.
It is hard to verify information that you are unfamiliar with. It would be like learning from a message board. Can you really trust what is being said?
We were able to learn before LLMs.
Libraries are not a new thing. FidoNet, USENET, IRC, forums, local study/user groups. You have access to all of Wikipedia. Offline, if you want.
I think it's accurate to say that if I had to do that again, I'd be basically screwed.
Asking the LLM is a vastly superior experience.
I had to learn what my local library had, not what I wanted. And it was an incredible slog.
IRC is another example--I've been there. One or two topics have great IRC channels. The rest have idle bots and hostile gatekeepers.
The LLM makes a happy path to most topics, not just a couple.
Figuring out 'make' errors when I was bad at C on microcontrollers a decade ago? (still am) Careful pondering of possible meanings of words... trial and error tweaks of code and recompiling in hopes that I was just off by a tiny thing, but 2 hours later and 30 attempts later, and realizing I'd done a bad job of tracking what I'd tried and hadn't? Well, made me better at being careful at triaging issues. But it wasn't something I was enthusiastic to pick back up the next weekend, or for the next idea I had.
Revisiting that combination of hardware/code a decade later and having it go much faster with ChatGPT... that was fun.
Like, I agree with you and I believe those things will resist and will always be important, but it doesn't really compare in this case.
Last week I was out in nature and I saw a cute bird that I didn't know. I asked an AI and got the correct answer in 10 seconds. Of course I could have found the answer at the library or by looking at proper niche sites, but I would not have done it, because I simply didn't care that much. It's a stupid example, but I hope it makes the point.
We were able to learn before the invention of writing, too!
And that's a bad thing. Nothing can replace the work in learning, the moments where you don't understand it and have to think until it hurts and until you understand. Anything that bypasses this (including, for uni students, leaning too heavily on generous TAs) results in a kind of learning theatre, where the student thinks they've developed an understanding, but hasn't.
Experienced learners already have the discipline to use LLMs without asking too much of them, the same way they learned not to look up the answer in the back of the textbook until arriving at their own solution.
When I got stuck on a concept, I wasn't screwed: I read more; books if necessary. StackExchange wasn't my only source.
LLMs are not like TAs, personal or not, in the same way they're not humans. So it then follows we can actually contemplate not using LLMs in formal teaching environments.
As long as you can tell that you don’t deeply understand something that you just read, they are incredible TAs.
The trick is going to be to impart this metacognitive skill on the average student. I am hopeful we will figure it out in the top 50 universities.
Sorry, but if you've gone to university, in particular at a time when internet access was already ubiquitous, surely you must have been capable of finding an answer to a programming problem by consulting documentation, manuals, or tutorials, which exist on almost any topic.
I'm not saying the chatbot interface is necessarily bad, it might be more engaging, but it literally does not present you with information you couldn't have found yourself.
If someone has a computer science degree and tells me they can't find solutions to basic problems without Stack Exchange, that is a red flag. That's like the article posted here about the people who couldn't program when their LLM credits ran out.
How do you know when it's bullshitting you though?
Sometimes right away, something sounds wrong. Sometimes when I try to apply the knowledge and discover a problem. Sometimes never, I believe many incorrect things even today.
Since when was it acceptable to only ever look at a single source?
The internet, and esp. Stack Exchange, is a horrible place to learn concepts. For basic operational stuff, sure, that works, but one should mostly be picking up concepts from books and other long-form content. When you get stuck, it's time to do three things:
Incorporate a new source that covers the same material in a different way, or at least from a different author.
Sit down with the concept and write about it and actively try to reformulate it and everything you do/don't understand in your own words.
Take a pause and come back later.
Usually one of these three strategies does the trick, no llm required. Obviously these approaches require time that using an LLM wouldn't. I have a suspicion doing it this way will also make it stick in long term memory better, but that's just a hunch.
Closed: RTFM, dumbass
<No activity for 8 years, until some random person shows up and asks "Hey did you figure it out?">
J. Random Hacker: Why are you doing it like that?
Newb: I have <xyz> constraint in my case that necessitates this.
J. Random Hacker: This is a stupid way to do it. I'm not going to help you.
Representative snippet:
> DO NOT GIVE ANSWERS OR DO HOMEWORK FOR THE USER. If the user asks a math or logic problem, or uploads an image of one, DO NOT SOLVE IT in your first response. Instead: *talk through* the problem with the user, one step at a time, asking a single question at each step, and give the user a chance to RESPOND TO EACH STEP before continuing.
Will also reduce the context rot a bit.
I've read people say it's a difficult challenge for the providers. But aren't there some pretty basic strategies? E.g., code pretty near the front of the stack that just does some fuzzy string comparison on all output? They don't need to rely on just model behavior…
I imagine it's likely that the model is just doing what it's good at? Hallucinating a prompt?
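To make the "fuzzy string comparison on all output" idea concrete, here is a minimal sketch of such a filter. The prompt text, threshold, and refusal message are invented for illustration; a production guardrail would also need to catch partial or paraphrased leaks.

```python
from difflib import SequenceMatcher

SYSTEM_PROMPT = "DO NOT GIVE ANSWERS OR DO HOMEWORK FOR THE USER. ..."  # hidden prompt (placeholder)

def looks_like_prompt_leak(output: str, threshold: float = 0.8) -> bool:
    """Flag model output that is suspiciously similar to the hidden system prompt."""
    # SequenceMatcher returns a similarity ratio in [0, 1]; compare case-insensitively.
    ratio = SequenceMatcher(None, SYSTEM_PROMPT.lower(), output.lower()).ratio()
    return ratio >= threshold

def filter_response(output: str) -> str:
    # Replace (or rewrite) any response that appears to echo the hidden prompt.
    if looks_like_prompt_leak(output):
        return "Sorry, I can't share that."
    return output
```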
How exactly you do it is often arbitrary/interchangeable, but it definitely does have an effect, and is crucial to getting LLMs to follow instructions reliably once prompts start getting longer and more complex.
Not saying it is indeed reality, but it could simply be programmed to return a different prompt from the original, appearing plausible but perhaps missing some key elements.
But of course, if we apply Occam's Razor, it might simply really be the prompt too.
Tokens are expensive. How much of your system prompt do you want to waste on dumb tricks trying to stop your system prompt from leaking?
You can test this prompt yourself elsewhere; you will notice that you get essentially the same experience.
The other chunk of time, to me anyway, seems to be creating a mental model of the subject matter, and when you study something well you have a strong grasp on the forces influencing cause and effect within that matter. It's this part of the process that I would use AI the least, if I am to learn it for myself. Otherwise my mental model will consist of a bunch of "includes" from the AI model and will only be resolvable with access to AI. Personally, I want a coherent "offline" model to be stored in my brain before I consider myself studied up in the area.
This is a good thing on many levels.
Learning how to search is (was) a good skill to have. The process of searching itself also often leads to learning tangentially related but important things.
I'm sorry for the next generations that won't have (much of) these skills.
I don't think it's so valuable now that you're searching through piles of spam and junk just to try to find anything relevant. That's a uniquely modern-web thing created by Google in their focus on profit over users.
Unless Google takes over libraries/books next and sells spots to advertisers on the shelves and in the books.
Most people don’t know how to do this.
Does it offer meaningful benefits to students over self directed study?
Does it outperform students who are "learning how to learn"?
What effect does allowing students to make mistakes have compared to being guided through what to review?
I would hope Study Mode would produce flash card prompts and quantize information for use in spaced repetition tools like Mochi [1] or Anki (a rough sketch of the export idea follows after the links).
See Andy's talk here [2]
[1] https://mochi.cards
[2] https://andymatuschak.org/hmwl/
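As a sketch of what that export could look like (the card contents and file name here are placeholders): Anki's plain-text importer accepts one note per line with tab-separated front/back fields, so LLM-generated Q/A pairs map onto it almost directly.

```python
import csv

# Hypothetical Q/A pairs, e.g. extracted from a study-mode conversation.
cards = [
    ("What does NDP largely replace from IPv4?", "ARP"),
    ("Antiderivative of 1/(x^2 + 1)?", "arctan(x) + C"),
]

# One note per line, front and back separated by a tab (Anki's basic text import format).
with open("cards.txt", "w", newline="", encoding="utf-8") as f:
    writer = csv.writer(f, delimiter="\t")
    writer.writerows(cards)

print(f"Wrote {len(cards)} cards to cards.txt; import via File > Import in Anki.")
```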
They want a student to use it and say “I wouldn’t have learned anything without study mode”.
This also allows them to fill their data coffers more with bleeding edge education. “Please input the data you are studying and we will summarize it for you.”
Not to be contrarian, but do you have any evidence of this assertion? Or are you just confidently confabulating a response for something outside of the data you've been exposed to? Because a commentor below provided a study that directly contradicts this.
This isn't study mode, it's a different AI tutor, but:
"The median learning gains for students, relative to the pre-test baseline (M = 2.75, N = 316), in the AI-tutored group were over double those for students in the in-class active learning group."
"The occurrence of inaccurate “hallucinations” by the current [LLMs] poses a significant challenge for their use in education. [...] we enriched our prompts with comprehensive, step-by-step answers, guiding the AI tutor to deliver accurate and high-quality explanations (v) to students. As a result, 83% of students reported that the AI tutor’s explanations were as good as, or better than, those from human instructors in the class."
Not at all dismissing the study, but to replicate these results for yourself, this level of gain over a classroom setting may be tricky to achieve without having someone make class materials for the bot to present to you first
Edit: the authors further say
"Krupp et al. (2023) observed limited reflection among students using ChatGPT without guidance, while Forero (2023) reported a decline in student performance when AI interactions lacked structure and did not encourage critical thinking. These previous approaches did not adhere to the same research-based best practices that informed our approach."
Two other studies failed to get positive results at all. YMMV a lot apparently (like, all bets are off and your learning might go in the negative direction if you don't do everything exactly as in this study)
However, consider the extent to which LLMs make the learning process more enjoyable. More students will keep pushing because they have someone to ask. Also, having fun & being motivated is such a massive factor when it comes to learning. And, finally, keeping at it at 50% of the speed for 100% of the material always beats working at 100% of the speed for 50% of the material. Who cares if you're slower - we're slower & faster without LLMs too! Those that persevere aren't the fastest; they're the ones with the most grit & discipline, and LLMs make that more accessible.
It concludes there's a learning curve that generally takes about 50 hours of use to work through. The data shows that the one engineer who had more than 50 hours of experience with Cursor actually worked faster.
This is largely my experience, now. I was much slower initially, but I've now figured out the correct way to prompt, guide, and fix the LLM to be effective. I produce way more code and am mentally less fatigued at the end of each day.
Like yeah, if you’ve only ever used an axe you probably don’t know the first thing about how to use a chainsaw, but if you know how to use a chainsaw you’re wiping the floor with the axe wielders. Wholeheartedly agree with the rest of your comment; even if you’re slow you lap everyone sitting on the couch.
(Qualifications: I was a reviewer on the METR study.)
https://metr.org/blog/2025-07-10-early-2025-ai-experienced-o...
I believe the benefits and drawbacks of AI augmentation for humans performing various tasks will vary wildly based on the task, the way the AI is being asked to interact, and the AI model.
I bring this up because the way I see students "study" with LLMs is similar to this misapplication of tutoring. You try something, feel confused and lost, and immediately turn to the pacifier^H^H^H^H^H^H^H ChatGPT helper to give you direction without ever having to just try things out and experiment. It means students are so much more anxious about exams where they don't have the training wheels. Students have always wanted practice exams with similar problems to the real one with the numbers changed, but it's more than wanting it now. They outright expect it and will write bad evals and/or even complain to your department if you don't do it.
I'm not very optimistic. I am seeing a rapidly rising trend at a very "elite" institution of students being completely incapable of using textbooks to augment learning concepts that were introduced in the classroom. And not just struggling with it, but lashing out at professors who expect them to do reading or self study.
Unfortunately that group is tiny and getting tinier due to dwindling attention spans.
Helping you parse notation, especially in new domains, is insanely valuable. I do a lot of applied math in statistics/ML, but when I open a physics book the notation and the comfort with shorthand are a real challenge (likewise, I imagine the reverse is equally annoying). Having an LLM on demand to instantly clear up notation is a massive speed boost.
Reading German Idealist philosophy requires an enormous amount of context. Being able to ask an LLM questions like "How much of this section of Mainländer is coming directly from Schopenhauer?" is a godsend in helping understand which parts of the writing are merely setting up what is already agreed upon vs. laying new ground.
And the most important for self-study: verifying your understanding. Backtracking because you misunderstood a fundamental concept is a huge time sink in self-study. Now, every time I read a formula I can go through all of my intuitions and understanding about it, write them down, and verify. Even a "not quite..." from an LLM is enough to make me realize I need to spend more time on that section.
Books are still the highest density information source and best way to learn, but LLMs can do a lot to accelerate this.
It's my primary fear building anything on these models, they can just come eat your lunch once it looks yummy enough. Tread carefully
True, and worse, they're hungry because it's increasingly seeming like "hosting LLMs and charging by the token" is not terribly profitable.
I don't really see a path for the major players that isn't "Sherlock everything that achieves traction".
> In the computing verb sense, refers to the software Sherlock, which in 2002 came to replicate some of the features of an earlier complementary program called Watson.[1]
[1] https://en.wiktionary.org/wiki/Sherlock
As long as features like Study Mode are little more than creative prompting, any provider will eventually be able to offer them and offer token-based charging.
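To illustrate how thin that layer is, a study-mode-style tutor can be approximated with nothing but a system prompt over the standard chat completions API. This is only a sketch: the model name and prompt wording are placeholders, not OpenAI's actual Study Mode implementation.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

TUTOR_PROMPT = (
    "You are a patient tutor. Do not give final answers immediately. "
    "Walk the user through the problem one step at a time, asking a single "
    "question at each step and waiting for a response before continuing."
)

def ask_tutor(question: str) -> str:
    # Single-turn call for brevity; a real tutor would keep appending the
    # running conversation to `messages` so the model remembers earlier steps.
    response = client.chat.completions.create(
        model="gpt-4o",  # placeholder; any chat-capable model works
        messages=[
            {"role": "system", "content": TUTOR_PROMPT},
            {"role": "user", "content": question},
        ],
    )
    return response.choices[0].message.content

print(ask_tutor("How do I integrate 1/(x^2 + 1)?"))
```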
During the early days of tech, was there prevailing wisdom that software companies would never be able to compete with hardware companies because the hardware companies would always be able to copy them and ship the software with the hardware?
Because I think it's basically the analogous situation. People assume that the foundation model providers have some massive advantage over the people building on top of them, but I don't really see any evidence for this.
A proper learning tool will keep a history of its conversations with the student, understand their knowledge level, have handcrafted curricula (to match whatever the student is supposed to learn), and be less susceptible to hallucination.
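As a gesture at what that might mean in code, here is a hypothetical learner-state record such a tool could persist between sessions; every field name here is invented for illustration, not taken from any existing product.

```python
from dataclasses import dataclass, field

@dataclass
class LearnerState:
    """Hypothetical per-student record a tutoring product might keep between sessions."""
    student_id: str
    curriculum_id: str                      # handcrafted syllabus the student follows
    mastered_topics: set[str] = field(default_factory=set)
    struggling_topics: set[str] = field(default_factory=set)
    conversation_summary: str = ""          # rolling summary fed back into the model

    def next_focus(self) -> str | None:
        # Naive policy: revisit the first topic the student is struggling with.
        return next(iter(self.struggling_topics), None)
```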
OpenAI have a bunch of other things to worry about and won't just pivot to this space.
These days they’ve pivoted to a more enterprise product and are still chugging along.
All big companies today, not only foundation-model providers but also those in media and other verticals, tend to want to be the platform for the end user. They don't want a middleman.
If you are trying to be a middleman, you should be prepared.
If you want to try and make a quick buck, fine, be quick and go for whatever. If you plan on building a long term business, don't do the most obvious, low effort low hanging fruit stuff.
One issue I found is the typical "LLM accuracy" issue, with seemingly no recourse. I tried to generate some courses for topics I already know well, just to review how accurate it is, and while for popular subjects (ex: "Electronic Music Fundamentals") it gets most of the details correct, less popular subjects (ex: "Scene Transitions with Octatrack") are riddled with errors (both in the "docs" and the quizzes/exercises), and I cannot find a way of correcting/adjusting/reporting the errors.
Have you considered using the LLM to give tests/quizzes (perhaps just conversationally) in order to measure progress and uncover weak spots?
It's still a work in progress, but we are trying to make it better every day.
* for each statement, give you the option to rate how well you understood it. Offer clarification on things you didn't understand
* present knowledge as a tree that you can expand to get deeper
* show interactive graphs (very useful for mathy things when you can easily adjust some of the parameters; a rough sketch of the idea follows after this list)
* add quizzes to check your understanding
... though I could well imagine this being out of scope for ChatGPT, and thus an opportunity for other apps / startups.
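For the interactive-graphs bullet above, a minimal sketch of the kind of widget a tutoring app could generate; the damped-oscillation example and parameter range are arbitrary placeholders.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.widgets import Slider

# Damped oscillation whose damping coefficient the learner can vary interactively.
t = np.linspace(0, 10, 500)

fig, ax = plt.subplots()
plt.subplots_adjust(bottom=0.25)  # leave room for the slider below the plot
line, = ax.plot(t, np.exp(-0.5 * t) * np.sin(2 * np.pi * t))
ax.set_xlabel("t")
ax.set_ylabel("amplitude")

slider_ax = plt.axes([0.2, 0.1, 0.6, 0.03])
damping = Slider(slider_ax, "damping", 0.0, 2.0, valinit=0.5)

def update(_val):
    # Redraw the curve with the newly chosen damping coefficient.
    line.set_ydata(np.exp(-damping.val * t) * np.sin(2 * np.pi * t))
    fig.canvas.draw_idle()

damping.on_changed(update)
plt.show()
```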
I'm very interested in this. I've considered building this, but if this already exists, someone let me know please!