Further evidence of my justified true belief that the world will rapidly divide between two groups: those who already had programmatic control over their computers, now infinitely more productive through direct access to the GPT APIs (to say nothing of the firms that will use LLMs only internally, trained on their own codebases, for example), and those who believe GPT = ChatGPT, for whom it will slowly become just another conduit for ads via plugins. It's quite sad when even Nature magazine becomes a spokesperson for the latter group, who will go on believing that LLMs are a racket on the order of Bitcoin.
With more and better models released, getting access to OpenAI's API is becoming less interesting.
What is interesting is having access to GPUs (cloud or real hardware). The world is divided into "has access to enough GPU power / doesn't have access to enough GPU power".
My prediction is that this will get worse for a while.
Maybe, but I can't help but notice the "programmatic control" is over someone else's computer. Those who care the most about having programmatic control over their computers are also those who would rather send their prompts to their own GPU rather than an Azure server farm. I believe there will be a programming revolution built on the foundation of LLMs, but it won't really take off until we can use local LLMs for the bulk of the processing.
And with computers there was a divide between those willing to learn to program and those using the applications that were programmed... what's your point? The majority of society needs a pretty big level of abstraction to be willing to use something.
Yes I think I have a similar sentiment. Articles like these would do better to take the approach of
"Oh shit, something huge is happening and we might not be intelligent enough to see the ramifications, but here's a humble attempt"
rather than
"Ha, the techies have done something they think is impressive again. It's certainly interesting, but as usual they're exaggerating it and failing to think from a nuanced human perspective. However us journalists are trained in that sort of thing, so we can help out here, and we know you readers will be all too familiar with the way those techies can only think like computers lol."
I feel it gives me about a 30% lift on mechanical tasks and a 60% lift on learning / unblocking in areas of ambiguity. I use GPT for the mechanical tasks and ChatGPT for the learning tasks. An example of the learning would be "explain to me the use of Box::pin in the context of Rust futures" or some such, but also sometimes some common idiom I'm brain-farting on. Searching Kagi will yield the answers, just more slowly and more deeply embedded in some document or Stack Overflow answer vomit, requiring a lot of wasted effort that fully distracts me from my flow. The fact that I can ask follow-up questions on areas of ambiguity is useful. When it hallucinates, it generally means I'm in an area that's either undefined as of yet or really niche. The nice thing about programming is that hallucination feedback is basically instant, so I then pull out Kagi and research a bit, and maybe 90% of the time it's just not possible.
There has been some work done on generating code in a feedback cycle to winnow out hallucinations, and it seems to work fairly well [1]. I think 99% of the challenges LLMs face are primarily related to a lack of constraint, optimization, agency, and solver feedback. As they get integrated into systems with the ability to inform, constrain, and guide them using classic AI techniques, their true value will be attainable. But they're pretty useful even today.
N.b., I'm a 32-year veteran at the distinguished-engineer level at FAANG and adjacent firms, and I program daily.
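For illustration, a minimal sketch of the feedback cycle the comment describes: generate code, execute it, and feed any failure back into the next prompt. The function and parameter names here are my own assumptions, and `ask_llm` is a stand-in for whatever model API you use, not a real library call.

```python
import subprocess
import sys
import tempfile

def generate_with_feedback(task, ask_llm, max_rounds=3):
    """Ask a model for code, run it, and feed any traceback back
    into the next prompt until the code executes cleanly.
    `ask_llm(prompt) -> code string` is a hypothetical callback."""
    prompt = task
    for _ in range(max_rounds):
        code = ask_llm(prompt)
        # write the candidate code to a temp file so we can run it
        with tempfile.NamedTemporaryFile("w", suffix=".py",
                                         delete=False) as f:
            f.write(code)
            path = f.name
        result = subprocess.run([sys.executable, path],
                                capture_output=True, text=True)
        if result.returncode == 0:
            return code  # ran without error; good enough for this sketch
        # loop: hand the error back to the model and try again
        prompt = (task + "\n\nYour last attempt failed with:\n"
                  + result.stderr + "\nPlease fix it.")
    return None  # still failing after max_rounds
```

"Ran without error" is of course a weak oracle; a real harness would run unit tests or a solver at that step instead.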
I use GPT to write code sometimes instead of importing a library. For instance a function to breadth first traverse a directed acyclic graph and slice it into levels. Another one to find a node in a nested graph using partial paths. I could have written those functions, but GPT4 does it correctly and faster.
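A sketch of the first function described, BFS over a DAG sliced into levels (the function name and the adjacency-dict representation are my assumptions, not the commenter's actual generated code):

```python
from collections import deque

def slice_into_levels(graph, roots):
    """Breadth-first traverse a DAG (adjacency dict: node -> children)
    and group nodes into levels by their BFS depth from the roots."""
    levels = []
    seen = set(roots)
    frontier = deque(roots)
    while frontier:
        level = list(frontier)
        levels.append(level)
        next_frontier = deque()
        for node in level:
            for child in graph.get(node, []):
                if child not in seen:  # each node lands in one level only
                    seen.add(child)
                    next_frontier.append(child)
        frontier = next_frontier
    return levels

# diamond DAG: a -> b, c; b -> d; c -> d
dag = {"a": ["b", "c"], "b": ["d"], "c": ["d"]}
print(slice_into_levels(dag, ["a"]))  # [['a'], ['b', 'c'], ['d']]
```

Note this groups by BFS depth (shortest path from a root); slicing by longest path, i.e. topological level, would need a different pass.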
I almost always begin my tasks with ChatGPT by asking it for a framework: "Design an x request that gets me results in the form of this JSON, and now display it on y table according to these criteria."
It gives me a framework; I adjust and expand upon it. Works wonderfully.
I even have a 'bot' running on one of the IRC channels that was almost 100% written by ChatGPT.
That's if you're explicitly asking it questions. Have you sat down and coded with Copilot offering suggestions as you went? It's honestly incredibly helpful, especially when leveraging a new language or stack.
From my personal experience, it's quite useful, but not as much as a lot of people thought it would be. Its ability to solve 'simple' questions is great. I use it more as a smarter Google.
The second group is going to be using chat interfaces in every app whether they like it or not in under 2 years. I think it'll be a net benefit, boomers (and everyone else too) will finally get their wish of just telling the computer what to do.
I used GPT-4 before it changed, and though it's impressive, writing code has never been the bottleneck for me personally. When these things can understand the business requirements and tell me what I should be building and why, with detailed, sensible reasoning, then I'll be hyped.
> When these things can understand the business requirements
I’ve been looking into this. Nothing definitive yet, but my hunch is that current LLMs struggle because they lack curiosity. They will answer your question, however vague, with the first most obvious answer they can think of.
This is great, if you’re a junior team member. Super talented, very eager. But a more senior engineer approaches the problem differently. They ask more questions than provide answers. They’ll happily spend the first 25min of a 30min meeting asking heaps upon oodles of really dumb sounding questions.
Then the last 5min, that’s the magic. They now have a solution perfectly tailored to the problem at hand, with all sorts of edge cases either explored or eliminated through questioning. The questions they kept asking weren’t dumb after all, they were pruning a decision/option tree in their head of all possible solutions until they landed on the most optimal solution for a set of known constraints. With further options to dig and improve.
I think you can make an LLM do this (i’m trying), but it’s very very slow still.
I think you’re still anthropomorphizing LLMs too much by saying the problem is a lack of curiosity. It’s an engineering problem: you haven’t figured out the correct “context” from which business requirements would follow. (And that’s why Microsoft is so incredibly well positioned for the future.)
This is because curiosity requires free energy, which makes it very costly when you're limited to very expensive compute.
This is what tree of thought is attempting to simulate in some ways. Build a set of multiple questions around the original question and then build on and prune that list based on a 'show your work' set of steps, and then keep iterating.
Humans naturally sidestep the halting problem when thinking about things: work on something long enough without a break and we'll pass out, and maybe when we wake, eat, and go to work we'll have stopped working on the same problem. But an LLM never sleeps. In theory, with ToT and no time limit, you could find out your AutoGPT spent $10 million in computing resources contemplating navel lint. So there are a number of unsolved problems there.
What really becomes concerning is if Nvidia achieves its goal of speeding up training/inference by a million times in the next few years, and if the amount of compute we produce increases by a few million times. You and I simply can't set hundreds of minds thinking for years straight; machines could.
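The branch-and-prune loop described above can be sketched as a small beam search. Everything here is illustrative, not any particular paper's implementation; the `propose` and `score` callbacks stand in for LLM calls.

```python
def tree_of_thought(question, propose, score, width=3, depth=2):
    """Beam-search sketch of tree-of-thought reasoning.
    `propose(chain)` returns candidate next thoughts for a partial
    chain; `score(chain)` rates a chain of thought ('show your work').
    In a real system both would be model calls."""
    beam = [[question]]
    for _ in range(depth):
        # branch: extend every surviving chain with each candidate thought
        candidates = [chain + [step]
                      for chain in beam
                      for step in propose(chain)]
        # prune: keep only the `width` best-scoring chains
        beam = sorted(candidates, key=score, reverse=True)[:width]
    return beam[0]  # best full chain found
```

The cost concern in the thread is visible right in the loop: each level multiplies the number of `propose`/`score` calls, and nothing here ever decides to stop on its own.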
You're missing out. 90% of my boilerplate code is automatically customized and inserted in my codebase for me by copilot. Maybe "manual labor" makes you happy, that's up to you.
A lot of people think GPT-4 got worse with a semi-recent change, worse as measured by intelligence or accuracy on complex inputs. Most notably, I think the change was around the time they significantly increased the speed of GPT-4.
I don't have much of an opinion on this, though; I just know I see that description of GPT-4 a lot. I use GPT-4 daily, but I don't really challenge it most days, as I've never trusted it enough to, hah.
Anthropomorphizing is a good way to get ChatGPT to write some obvious but tedious code, or to remind you how to cherry-pick in git, or any of the myriad tasks most of us know how to do but would otherwise look up on some site. I give ChatGPT good context and then review the code it presents the same way I review code I see elsewhere. One warning, though: don't ask ChatGPT to code something that you don't know how to interpret and/or aren't willing to test for correctness.
The article is introductory in nature and tends to be irrelevant for the HN audience.
What worries me is that I see people here commenting more and more often on the author's life and their academic or professional attributes instead of paying attention to the article itself.
I'd been using it for weeks, but recently it was gutted so badly that it barely understands TypeScript. I'm doing my own coding again, because sifting through all the possible bugs is more work than just writing it yourself. A true shame, because before it got "faster" it was capable of creating entire apps.
On sites like Stack Overflow, incorrect and subtly wrong answers are often peppered with follow-up answers. But in a ChatGPT session, I alone have to critique its output, without contextual info regarding the data's source and without a community to help me critique it.
I find that a bit exhausting; beyond simple use-cases, I've found it easier to just do it myself.
As I've been moving on from writing code to higher-level work, I find these tools very welcome. I can just ask for a quick grid layout or the right ffmpeg string of options and go about my tasks, instead of having to dive into the nitty-gritty or do "manual labour" every time.
I think they should use a different name altogether for ChatGPT 4, as it is very different from and superior to 3.5. And when people say ChatGPT doesn't work for them, I'd like to know which one they tried.
I was chatting with someone I know that has a paid subscription for ChatGPT 4. I was venting about the hallucinations I was getting for a certain problem; they mentioned how much better v4 is, and proceeded to plug my prompt into it. The answer it gave was actually a more blatant lie compared to the responses I got from v3.5.
I've been seeing this "try ChatGPT 4" advice all over HN. I am using 4, and it still hallucinates and generates quite useless code. It's only marginally better in my experience.
The world right now seems to be divided more into “have GPT-4 API/ don’t have GPT-4 API”.
How much more productive do you feel you are coding with an LLM? As another HN user said, to me it's like a talking dog -- incredible yet useless.
[1] https://voyager.minedojo.org
Based on my own attempts to have ChatGPT generate code for me, it's better NOT to trust it. It tends to hallucinate even with simple requirements.
GPT: Alice; GPT-2: Bob; GPT-3: Carol; GPT-3.5: Mountain Carol; GPT-4: Dallas; …; GPT-19: Sydney
You mean other than the fact that it costs money?