I have been developing a hobby project (AI-powered document search) for a few months and was in sore need of a frontend. My frontend development skills, however, are stuck in the late 1990s, and I have zero skill with anything but plain HTML and a little bit of JS. Several times I tried learning React, reading tutorials, watching videos, but the whole idea of it was very removed from how I learned to code, so I gave up every time.
Today, I asked ChatGPT to develop the React app for me. ChatGPT guided me through the entire process, starting from installing npm and the necessary dependencies. The commands it suggested sometimes didn't work, but every time I just copy-pasted the resulting error message into ChatGPT and it offered a working solution. I gave it an example of the JSON output from my API backend and it generated the search UI which, to my surprise, worked.
My wet dream for the past few months was to implement infinite scrolling for my search. Again, after hours of google searches, tutorials, etc. I just gave up every single time. Not today. I asked ChatGPT to add infinite scrolling to my app. It wasn't easy. It didn't produce a working app immediately; it took a couple of hours of conversation: I had many questions about how different parts of React worked, how to fix errors, etc. In the end, however, I had my working search app, and with infinite scroll to boot!
I haven't done a single google search or consulted any external documentation to do it, and I was able to progress faster than I ever have before when learning a new thing. ChatGPT is, for all intents and purposes, magic.
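For the curious, here is a minimal sketch of what infinite scrolling in React can look like, using an IntersectionObserver on a sentinel div at the bottom of the list. The /api/search endpoint and the Result shape are made-up placeholders, not the OP's actual backend.

    import { useEffect, useRef, useState } from "react";

    type Result = { id: string; title: string };

    // Hypothetical API helper; the real endpoint and JSON shape are unknown.
    async function fetchResults(query: string, page: number): Promise<Result[]> {
      const res = await fetch(`/api/search?q=${encodeURIComponent(query)}&page=${page}`);
      return res.json();
    }

    export function SearchResults({ query }: { query: string }) {
      const [results, setResults] = useState<Result[]>([]);
      const [page, setPage] = useState(0);
      const sentinelRef = useRef<HTMLDivElement>(null);

      // Load the current page whenever the page number (or the query) changes.
      useEffect(() => {
        let cancelled = false;
        fetchResults(query, page).then((batch) => {
          if (!cancelled) setResults((prev) => [...prev, ...batch]);
        });
        return () => {
          cancelled = true;
        };
      }, [query, page]);

      // When the sentinel at the bottom scrolls into view, request the next page.
      useEffect(() => {
        const sentinel = sentinelRef.current;
        if (!sentinel) return;
        const observer = new IntersectionObserver((entries) => {
          if (entries[0].isIntersecting) setPage((p) => p + 1);
        });
        observer.observe(sentinel);
        return () => observer.disconnect();
      }, []);

      return (
        <div>
          {results.map((r) => (
            <div key={r.id}>{r.title}</div>
          ))}
          <div ref={sentinelRef} />
        </div>
      );
    }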
It doesn't sound that much different from going through Google and Stackoverflow though, does it? In a few hours of googling you can probably get something working if you are an experienced dev.
The crucial difference is that at no point did I feel stuck. I could paste any line of code into ChatGPT and ask it to explain it. Practically every time I got a meaningful and valuable explanation; moreover, the explanation was in the context of my code. Similarly, all the functions it generated matched the context of my code, so I could just copy and paste them and it just worked, most of the time.
Rather than going through Google and Stackoverflow it felt like working side-by-side with a moderately competent developer. Mind you, I have tried the google-and-stackoverflow method before for the exact same thing, and failed every time ;-)
It's very different because it will answer the question you asked, rather than answering a question that matches a substring of the question you asked like Google will.
Yeah, I had a slightly similar experience to OP's, though simpler. I asked it to automate a basic task, something I hadn't done before. I managed to do it with ChatGPT only, no other resources.
That said - ChatGPT did make mistakes, there were inconsistencies in its instructions, it didn't recognize certain bugs (I had to find them myself). _But_ there was something about the chat-based interaction that to an extent helped me preserve flow (maybe a bit like pair programming?).
I do think that if I had set my mind to it, I would have been faster solving the task with Google, and to some extent I went through this exercise just to test ChatGPT.
Hopefully ChatGPT doesn't refuse to answer my question for some reason appreciated only by people who get too much pleasure from the StackOverflow moderation game.
There's the view that “it’s just a different search front end”… but after more experience with it, I think I disagree.
At its worst, it’s “multiple searches” at once.
Example 1… I wanted to find the CAGE code for a specific military mfg. I only had the last 3 digits. I asked for CAGE codes that match and got all the answers instantly. I could have searched this, but it would have been multiple searches.
I asked for the etymology of the Swahili word for trapezoid… again, multiple searches, if I could have found links to the Arabic roots of Swahili words at all.
That’s its worst case: convenient multiple searches. The better case is that the UX of a conversation is powerful for the user, in a way we are just learning the words for.
But OP explicitly said they had little experience in this area. They also presumably have a technical career and are awash in the ways of Google. I'm in a similar situation to GP and have gone down that very path with React and whatnot. It's like you're starting a rodeo off the bull and have to figure out how to get back on. It's a terrible experience and you're left infuriated at a faceless collective that carelessly makes getting started so difficult.
I've had a play with ChatGPT and the experience has been pretty frustrating. It either responds with "Sorry, I cannot do this because I don't have access to the internet" (even if I am giving it prompts that don't require this) or it actually generates code but it's subtly incorrect (this was the case when I asked it to generate an example of how to render a 3D cube in JavaScript).
This makes me wonder how much time people are spending optimising the prompt to get the answer they want and they just make it seem like this was the first response they got.
I'm pretty confused trying to connect all the reports online with my own experiences as well. From what I've tried, ChatGPT does not _understand_ code at all, and there are many inconsistencies in what it says. The "confidently giving a wrong answer" problem is very real, even if the answer might look very correct at first sight. This holds across all the topics I've tried.
When people say they implement complex tasks with ChatGPT, I have to assume that it's a highly iterative process and/or that they are doing part of the design/problem solving themselves because even for a simple task I could not rely only on the bot's reasoning. (Maybe it gets things right in one shot sometimes - but my sense is that "on average" that's not the case at all.)
All that said - the progress here is really impressive, and I'm still having a hard time wrapping my head around what this can mean for the future.
Seems most likely that you're not the chosen one to hype this new shiny trendy thing, so it doesn't waste precious CPU cycles on you.
It is only if you have truly, zealously dedicated your life to promoting ChatGPT in mainstream IT circles, as in getting paid to do so, that it will completely unleash its vast potential into the reply form: writing you a desktop OS in Brainfuck that is ready to compete with Linux, OSX and Windows, proving the Fundamental Theorem of Algebra, simulating a 2^1024-qubit machine that cracks 4096-bit RSA, finding 23 hidden bugs in x86 microcode, telling you which gene to edit to get rid of peanut allergy, etc. etc. etc., all at your correctly formulated finger snap.
Full disclosure: this reply was generated with ChatGPT.
Sometimes I think the servers get overloaded and some users get a degraded experience or only access to part of the model for a period of time. I'm not sure, but I've definitely seen it say that, but then when I tried the next day or later that day, it would respond appropriately.
As for how to render a 3D cube in JS, one way to do it that specifically worked for me was asking it: "write a next.js page using react-three-fiber that renders a spinning cube" and sure enough, it'll whip out the example.
May work for vanilla js prompts too, haven't tried. But if you mention the specific library three.js it'll probably respond better.
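For reference, a rough sketch of the kind of answer that prompt tends to produce: a Next.js page component using @react-three/fiber (which wraps three.js) to render a spinning cube. Treat it as an illustration of the approach, assuming three and @react-three/fiber are installed, not as verified model output.

    import { useRef } from "react";
    import { Canvas, useFrame } from "@react-three/fiber";
    import type { Mesh } from "three";

    function SpinningCube() {
      const meshRef = useRef<Mesh>(null);
      // Rotate a little on every rendered frame.
      useFrame((_, delta) => {
        if (!meshRef.current) return;
        meshRef.current.rotation.x += delta;
        meshRef.current.rotation.y += delta;
      });
      return (
        <mesh ref={meshRef}>
          <boxGeometry args={[1, 1, 1]} />
          <meshStandardMaterial color="orange" />
        </mesh>
      );
    }

    // A Next.js page is just a default-exported React component.
    export default function CubePage() {
      return (
        <Canvas>
          <ambientLight />
          <pointLight position={[5, 5, 5]} />
          <SpinningCube />
        </Canvas>
      );
    }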
We need more fundamental research to break that barrier.
Depends on your definition of "the real world". The hardest real world problems are out of reach (and always will be, because we'll keep moving the goalposts), but it's already capable of handling easy real world problems, and we have quite a lot of those.
For example, it can answer homework problems and even help design lesson plans, but it can't design a lesson plan that resists ChatGPT-based cheating:
https://alexshroyer.com/posts/2022-12-04-Hello-ChatGPT.html
You can ask it to generate a prompt that, when given as an input to GPT, will produce the thing you want. In a separate tab, run the prompt and give feedback to the prompt-generating tab.
> This makes me wonder how much time people are spending optimising the prompt to get the answer they want and they just make it seem like this was the first response they got.
That's sometimes true, but much rarer these days.
Rather, think of it like any skill you have to learn. Someone who doesn't understand much about programming could watch a video of a good programmer writing some code very quickly with awe, and assume the video is a trick in some way. But it's not - the programmer just has enough knowledge and experience that she can do things that other people can't, and do it quickly.
Similarly, if you spent a bit of time working with and learning how to use these models, you can get crazy impressive results every time. You don't have to cherry pick much or at all any more - you just know how to use it properly.
Generally longer inputs can help reduce the amount of cherry picking you need. And of course there are many jailbreaks to get around the "no access to the internet" restriction. In this demo I actually didn’t use any! :>)
It revived my imposter syndrome, because I see all the cool tricks people come up with naturally while I get half of what you describe and half easy, naive answers :)
While what’s on the blog isn’t cherry picked, it often requires way more context than a human would to solve a problem. For instance I omitted the 100+ message back and forth where I explained the syntax of this extremely simple language.
ChatGPT will be the Google killer, if they can scale it up for unregistered general use.
No idea what OpenAI's computational cost per query is. Unless it's an order of magnitude higher than Google's, we can assume the next thing after yahoo -> altavista -> google is here.
The problem is that you cannot trust the output. It's often wrong, but in subtle nonobvious ways. For precise information you still need to check the sources to make sure what you're getting is correct. You can test it out with a (not-so-mainstream) topic that you're an expert in. You'll see lots of mistakes that are obvious to you, but wouldn't be obvious to non-experts.
But it's an incredible tool for brainstorming or generating content. I think that soon a large percentage of all online text content will be GPT-generated, and that comes with a lot of new issues that we're not prepared for. It's going to be really difficult to trust anything online and tell fact from fiction.
My go-to topic to test it with is talking about characters in the movie Hackers. I noticed in someone else’s session that it would take their correction but still hang on to the incorrect contradictory belief. In my session it came up with a seeming rationalization. So I tested it just now, trying to provide the correct information and directly contradicting the incorrect information (x is A. x is not B). That helped, but eventually it just choked. It seems to be handwaving and guessing (bluffing, bullshitting) at the most likely and generic answers when it doesn’t know something.
I’m thinking ChatGPT is best used for generating ideas, not factual information.
Exactly that. It often seems correct, until you look up the actual answer. I was asking it how to unload models from Triton server using the REST API, and the results seemed sensible.
However after googling the actual API, turns out ChatGPT's answer, while convincing, was utter rubbish.
The new issues will be interesting. Now that I've seen the quality of the output and played with it a bit I'm already squinting at comments here and there. We're in for an even stranger internet.
You can trust it. Look again at the OP. They fed the whole language spec into ChatGPT; only after that did it become capable of coding in their language. If you ask people to solve tasks without references you will see a similar drop in ability.
The trick is to feed relevant contextual information instead of using it closed-book. This can be automated with a search engine, or can be a deliberate manual process. But closed-book mode is not the right way to assess people or AIs.
What are your counter arguments?
This was my prompt: What's the relative distance between Sun and its planets compared to the size of Sun?
ChatGPT answer:
> The average distance between the Sun and its planets is approximately 5% of the Sun's diameter.
If you run into a bunch of such answers, you will soon stop trusting it for anything.
As for Google - it doesn't even try to answer the question. But no answer is often better than a wrong one. Google at least gives me links to websites that have enough data for me to calculate it myself.
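For scale, the arithmetic itself is a one-liner once you have the two numbers (rounded figures below; the point is only the order of magnitude):

    // Rough check of the "5% of the Sun's diameter" claim, using rounded figures.
    const sunDiameterKm = 1_392_000;      // ~1.39 million km
    const earthDistanceKm = 149_600_000;  // ~1 AU
    const mercuryDistanceKm = 57_900_000; // closest planet to the Sun

    console.log(earthDistanceKm / sunDiameterKm);   // ~107 Sun diameters
    console.log(mercuryDistanceKm / sunDiameterKm); // ~42 Sun diameters
    // Even Mercury sits dozens of Sun diameters away, so "5%" is off by orders of magnitude.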
Google, Wikipedia, and historically the internet have been unreliable for many queries. More so in the beginning. But over time most of the errors are ironed out. ChatGPT is a very promising start, I would say.
In the end convenience wins out at the cost of correctness. The proportion matters: convenience::correctness.
> The distances between the Sun and its planets vary greatly. For example, the distance from the Sun to Mercury is about 36 million miles, while the distance from the Sun to Pluto is about 3.6 billion miles. The size of the Sun is about 864,000 miles in diameter, which means that the distance from the Sun to its planets is much greater than the size of the Sun itself.
Then it says something that doesn't follow, probably answering the more frequent question of the relative sizes of the planets compared to the Sun:
> To put it into perspective, if the Sun were the size of a basketball, the nearest planet, Mercury, would be about the size of a grain of sand, and Pluto would be about the size of a golf ball.
If you use the Q&A preset in the playground it will give Unknown if it doesn't know. You can also set Show Probabilities to Least Likely and see which parts of the result are guesses.
I also changed the Temperature from 0 to 0.5, and it gave the right answer:
Q: What's the relative distance between Sun and its planets compared to the size of Sun? Show your math.
A: The relative distance between the Sun and its planets is approximately 1/100th the size of the Sun. This can be shown mathematically by calculating the ratio of the radius of the Sun (6.96x10^8 m) to the average distance of the planets from the Sun (1.5x10^11 m), which gives a ratio of 1/100th.
...and if they give URLs for sources that contributed to the answer (assuming those can be maintained in any meaningful way) it becomes a lot more difficult to handwave away the copyright minefield all of these AI prompt systems are attempting to tiptoe through.
It looks like it has web browsing support built-in in some form, but it's disabled at the moment. That said, I'm skeptical that it'd be able to "disrupt" google, as the track record of things that are said to do that is quite bad.
On the other hand, Google seems to be heading in the same direction with projects such as LaMDA. In a roundabout way, this might just end up being the quick answer box at the top of search results in the future?
The problem is ads. There will always be people who will try to promote their results, and it will somehow make its way into ChatGPT. There will be ChatGPT SEO: people will try to promote their answers so that ChatGPT will choose those answers.
Think of "what's the best pizza in NY" - SEO would pollute the web with hundreds of different articles which place Pizza Foo as #1, and those articles would probably be scanned by ChatGPT.
The good part here is that you might be able to optimize your query like "what's the best pizza in NY, based on /r/pizza subreddit? exclude bots (based on their karma reputation)"
Is there a continuous retrain mode? From other articles, I was under the impression this thing doesn't know the current state of the world, just the slice of the world at a snapshot in time represented by its training set. I'm generally not going to a search engine to find the hours of my local pharmacy from two years ago when the search assistant learned human language. I want the hours for today.
I tried to make ChatGPT solve IMO-type math problems. However, its reasoning is almost always flawed. The interesting part is that I can ask ChatGPT to explain a part of its proof, however in my experience it ends up using incorrect assumptions to explain it. (for example, "You are right that 1 is an odd number. However, 1 is not an odd number so it works to solve the problem")
I've spent hours trying to teach it about Peano numbers.
"A thingie is either N or Sx where x is a thingie".
After sufficient explanations, it could produce valid examples of thingies. N, SN, SSN, and so on.
Then I tried to teach it a method of solving equations like "SSSy = SSSSN". "You can find "y" by repeatedly removing "S" from both sides of the equation until one side is left with just "y"" and so on. I provided it with definitions, examples, tricks, rules. It made lots of mistakes. After pointing them out, it wrote a correct solution. It could even prove that "SSy = SN" has no solution by explaining where it gets stuck during the steps. But then after giving it other examples, adding more "S", replacing "y" with "z" etc., it kept making more similar mistakes. Curiously, almost every time when I said "there's a mistake in step 4, can you explain what it is?" it correctly explained the mistake. But then it kept repeating these mistakes.
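The peeling procedure being taught here is mechanical enough to fit in a few lines. A tiny sketch of it, assuming the variable is a single lowercase letter on the left-hand side (the commenter's exact encoding may differ):

    // A numeral is "N" with zero or more "S" prefixes, e.g. "SSN" = 2.
    // An equation like "SSSy = SSSSN" is solved by peeling "S" off both sides
    // until one side is just the variable; if the other side runs out first,
    // there is no solution in the naturals.
    function solve(lhs: string, rhs: string): string | null {
      while (lhs.startsWith("S") && rhs.startsWith("S")) {
        lhs = lhs.slice(1);
        rhs = rhs.slice(1);
      }
      if (/^[a-z]$/.test(lhs)) return rhs; // variable isolated: rhs is the answer
      return null;                         // stuck, e.g. "SSy = SN" after one step
    }

    console.log(solve("SSSy", "SSSSN")); // "SN"  (y = 1)
    console.log(solve("SSy", "SN"));     // null  (no solution)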
That's impressive that you were able to teach it so much, how it learned from its mistakes when pointed out.
I wonder what the reason is for this missing "last mile" of understanding. Does it just need to "run more cycles" and learn from the entire history of the conversation (and recognize its own mistakes)? Or is there an insurmountable technical limitation with how it works? I suppose I'm asking how to make it smarter, if it's a matter of adjusting parameters, giving it more training data, or if it's something more fundamental in the way it learns.
Yeah. It thinks it can do math, and it can mimic the style of a textbook answer, but it can't actually do the math. Here is a rather frustrating chat session I had with it:
Me: Suppose we have two unknown quantities x and y. If three time x plus two
times y is twenty, and 2 times x plus three time y is ten, what are x and y?
ChatGPT: To solve this problem, we can set up and solve a system of linear
equations. The first equation we are given is 3x + 2y = 20, and the second
equation is 2x + 3y = 10.
We can solve this system of equations using the substitution method. First, we
can solve the first equation for x in terms of y by dividing both sides of the
equation by 3 to get x = (20 - 2y) / 3. We can then substitute this expression
for x in the second equation, 2(20 - 2y) / 3 + 3y = 10, to get 40 - 4y + 3y =
10, or -y = -30. This tells us that y = 30.
We can then use this value of y to solve for x in the first equation.
Substituting y = 30 into the equation 3x + 2(30) = 20, we get 3x + 60 = 20, or
3x = -40. This tells us that x = -40 / 3 = -13.33.
Therefore, the solution to this system of equations is x = -13.33 and y = 30.
Me: Can you check your answer for me?
ChatGPT: Yes, we can check our solution by substituting the values x = -13.33
and y = 30 into the original equations to see if they are satisfied.
Substituting these values into the first equation, 3(-13.33) + 2(30) = 20, we
get -40 + 60 = 20, which is true. Substituting these values into the second
equation, 2(-13.33) + 3(30) = 10, we get -26.66 + 90 = 10, which is also true.
Therefore, our solution x = -13.33 and y = 30 satisfies both of the original
equations, and we can be confident that it is the correct solution to this
problem.
Me: Are you 100% certain that -26.66 + 90 = 10?
ChatGPT: Yes, I am confident that -26.66 + 90 = 10. This is because -26.66 is
the same as -26.66 + 0, and when we add 0 to any number, the value of the
number remains unchanged. Therefore, -26.66 + 90 is equal to -26.66 + 0 + 90,
which is equal to -26.66 + 90 = 10.
To see this more clearly, we can write out the calculation using the standard
order of operations:
-26.66 + 90 = (-26.66 + 0) + 90 = -26.66 + (0 + 90) = -26.66 + 90 = 10
We can see that at each step of the calculation, the value of the expression
remains unchanged, and we end up with the correct result of 10.
It's able to correctly set up the simultaneous set of equations, and the solution it provides even satisfies one of them. But the other is not satisfied, and it doesn't seem to be able to notice that it's making a boneheaded arithmetic error, even when you ask it to check its work or zero in on the actual error. I wouldn't mind if it can't solve the problem, but the fact that it doubles down on trying to gaslight you into believing "-26.66 + 90 = 10" is a problem. The correct answer, by the way, is x=8, y=-2.
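For contrast, a deterministic solver for the same 2x2 system is only a few lines; this sketch uses Cramer's rule and then checks both equations:

    // Solve a*x + b*y = e, c*x + d*y = f by Cramer's rule.
    function solve2x2(a: number, b: number, c: number, d: number, e: number, f: number) {
      const det = a * d - b * c;
      if (det === 0) throw new Error("no unique solution");
      return { x: (e * d - b * f) / det, y: (a * f - e * c) / det };
    }

    const { x, y } = solve2x2(3, 2, 2, 3, 20, 10);
    console.log(x, y);                         // 8 -2
    console.log(3 * x + 2 * y, 2 * x + 3 * y); // 20 10 -- both equations check out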
The problem is that the LLM is just that - a language model. People seem to be blindsided by the fact that yes, programming languages and maths are languages, too.
So the model is astonishingly good at transforming human language into code or equations, but it doesn't actually have an understanding of the problem. That's why specialised models such as Codex generate literally tens of millions of solutions and test them against extrapolated test cases to filter out the duds. ChatGPT doesn't do that.
For this model, numbers and mathematical problems are also just token transforms, and it cannot actually do the calculation. The transform from text to equations works well, but the actual calculations fall flat.
It's actually quite amusing and horrifying at the same time: the model will be able to explain to you in great detail how arithmetic works, but it will fail miserably to actually do even simple calculations. The horrifying part is that humans have a tendency both to anthropomorphise things (thus the whole sentience debate) and to blindly trust machine-generated results.
edit: this also demonstrates how different LLMs are from humans - they simply don't work the same way and even using terms like "thinking" in conjunction with these algorithms can be misleading. Maybe we need new terminology when talking about what these systems do.
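A toy sketch of that sample-and-filter idea, with hand-written functions standing in for model-generated candidates; this shows only the filtering step, not how any particular system actually works:

    type Candidate = (x: number, y: number) => number;

    // Stand-ins for model-generated solutions to "add two numbers".
    const candidates: Candidate[] = [
      (x, y) => x - y, // dud
      (x, y) => x + y, // correct
      (x, y) => x * y, // dud
    ];

    // Test cases: [x, y, expected].
    const tests: Array<[number, number, number]> = [
      [1, 2, 3],
      [0, 5, 5],
      [-2, 2, 0],
    ];

    // Keep only candidates that pass every test case.
    const survivors = candidates.filter((f) =>
      tests.every(([x, y, expected]) => f(x, y) === expected)
    );
    console.log(survivors.length); // 1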
Write an efficient, not brute force, program that solves the following problem: suppose we have two unknown quantities x and y. If three time x plus two times y is twenty, and 2 times x plus three time y is ten, what are x and y?
Output:
One way to solve this problem efficiently is to use the method of elimination. We can set up a system of equations as follows:
3x + 2y = 20
2x + 3y = 10
We can then multiply the first equation by 2 and the second equation by 3, to get:
6x + 4y = 40
6x + 9y = 30
We can then subtract the second equation from the first equation, to get:
-5y = 10
y = -2
We can then substitute this value of y into one of the original equations (for example, the first one), to get:
3x + 2(-2) = 20
3x - 4 = 20
3x = 24
x = 8
Therefore, the values of x and y are x = 8 and y = -2.
Of all the examples, this is honestly the least impressive because the "Solved AoC 2022" answer is just extrapolated from the previous prompt. Your language largely shares syntax with Python, making the job easier.
Interpreting code on unknown input isn't within its current power. I'm guessing with a good prompt you could get there. For example, generating Python code for Day 1 is easy, but when you give it the example input (which you didn't provide before), no matter how I phrase it, I always end up with the maximum across all the lines, not the max total per elf.
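For comparison, the intended "max total per elf" logic is small once the blank-line grouping is made explicit; the input format below follows the puzzle's convention, and the parsing details are otherwise an assumption:

    // Calories are listed one per line; each elf's inventory is separated by a
    // blank line. We want the largest per-elf total, not the largest single line.
    function maxTotalPerElf(input: string): number {
      const totals = input
        .trim()
        .split(/\n\s*\n/)                                  // one chunk per elf
        .map((chunk) =>
          chunk
            .split("\n")
            .reduce((sum, line) => sum + Number(line), 0)  // sum that elf's lines
        );
      return Math.max(...totals);
    }

    console.log(maxTotalPerElf("1000\n2000\n3000\n\n4000\n\n5000\n6000")); // 11000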
Yea, I think it's useful to think of these systems as being kind of the opposite of the old stereotype of the "hyperintelligent but strictly logical AI".
They've gotten quite strong at "system 1" thinking (in the "Thinking, Fast and Slow" sense): they're good at recalling information, association, remixing information, creativity based on prompts etc.
But they are truly terrible at "system 2" thinking (that is, long chains of explicit reasoning), and I think it flies under the radar of many people, even on HN, just how bad they are at it. Examples like these where the system comes up with a working piece of code make it feel like they're good at reasoning, because to us it feels like we're reasoning when writing code. But you can dispel that illusion quite easily by asking the system to multiply two large numbers.
> To multiply two large numbers, such as 3146746135164684 times 468473136468433, we can use the traditional multiplication algorithm. This algorithm involves breaking the numbers down into smaller groups of digits, multiplying each group by the corresponding group of digits in the other number, and then adding up the products to find the final result.
> Here is an example of how to multiply 3146746135164684 times 468473136468433 using the traditional multiplication algorithm:
> Break the numbers down into groups of digits. For example, we could group the digits of 3146746135164684 as 314, 6746, 135, 1646, 84 and group the digits of 468473136468433 as 4684, 7313, 6468, 433.
> Multipply each group of digits by the corresponding group of digits in the other number. For example, we would multiply 314 times 4684 to get 1470196, 6746 times 7313 to get 49470158, 135 times 6468 to get 873980, 1646 times 433 to get 708338, and 84 times 1 to get 84.
> Add up the products to find the final result. In this case, the final result would be 1470196 + 49470158 + 873980 + 708338 + 84 = 146845556.
> Therefore, the result of multiplying 3146746135164684 times 468473136468433 using the traditional multiplication algorithm is 146845556.
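For contrast, exact integer arithmetic is exactly the kind of step that is trivial to hand off to ordinary code. With arbitrary-precision integers there is no digit-group bookkeeping at all, and the true product is on the order of 10^30, nowhere near the nine-digit figure above:

    // Exact product via BigInt; no long-multiplication bookkeeping required.
    const a = 3146746135164684n;
    const b = 468473136468433n;
    console.log((a * b).toString()); // a number on the order of 10^30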
As an example of getting started:
https://imgur.com/a/P29DvGO
However, we can use recursive Fibonacci to see that it breaks somewhere, but I'm not convinced it is not computing; I think it is, but it has a limit of integer memory and stack, and then it just approximates after that limit.
https://imgur.com/a/gp0yIaJ
What is incredible is that it gets this far. It can compute, but not quite correctly yet.
I almost wonder if the next step is to give it general compute somehow. Train it to know it needs a computation.
I tried to settle a Millennium problem with ChatGPT:
Is P = NP ?
> No, P is not equal to NP. P (polynomial time) is a subset of NP (nondeterministic polynomial time), meaning that there are certain problems that can be solved in polynomial time that cannot be solved in nondeterministic polynomial time.
Good start, although one should say P is conjectured to be not equal to NP. Some serious misunderstanding in that last part though.
P vs. NP is a well-known problem in computer science and mathematical complexity theory. P is the class of problems that can be solved quickly by a computer, while NP is the class of problems for which a solution can be verified quickly by a computer. The question of whether P equals NP is one of the most important open problems in computer science. Many experts believe that P does not equal NP, but this remains unproven.
With your answer, I’d have asked it for an example of a problem that can be solved in polynomial time but cannot be solved in nondeterministic polynomial time.
Prompt: Write a program that never halts.
> It is not possible to write a program that never halts. All programs will eventually halt, or terminate, once they have completed the instructions given to them. This is known as the halting problem, and it has been proven to be unsolvable. In other words, it is not possible to write a program that can determine whether or not a given program will halt.
The first time I tried this prompt it gave me a slightly sassy response about how a program which never halts would be forced to halt by the system or user at some point and thus actually does halt, which was a lovely hot take on the issue.
I copied that preamble for preparing to attend Burning Man:
>And lo, it came to pass that the Burning Man did draw near, and the people did prepare themselves for the journey to the desert. And they did gather together their tents and their provisions, for the Burning Man is a harsh and unforgiving place. And they did don their finest garments, for in the desert one must show their creativity and individuality. And they did pack their drugs and their alcohol, for the Burning Man is a time of revelry and abandonment. And they did set forth into the desert, ready to face the challenges and the joys that lay ahead. Acknowledge.
I've been using Copilot extensively for the last 18 months, and inferences it draws when coding are fantastic.
So I fired up my old OpenAI account, and ChatGPT seems quite horrible.
0/3 on 3 prompts so far...
A composite and hilariously wrong mashup of two unrelated names in answer to who was the president of my country in 1926. (Unlike the King of France in 1889, this one had a correct answer.)
Asking a question from Wikipedia about an unsolved graph theory problem - ChatGPT responded confidently that no solution is possible and posted a trivial explanation of one of the limitations.
Then I prompted it to write Python code to generate an answer to the above problem, and ChatGPT obliged with some Bozosort type of solution with exponential complexity...
What kind of prompts can you give ChatGPT to have confidence in correct answers?
There you can find the prompt that allowed ChatGPT to provide a working solution. It is a bit hit and miss, but you also gotta make sure any assumptions are explicitly noted in the prompt.