... and this is exactly what programming is: breaking down a task into steps that a computer can comprehend. I now have an even stronger feeling that everyone should end up being a programmer. Plus, GPT-3 is not exactly a good tool for programming.
I use GPT-3 Codex daily when working. It saves me time, helps me explore unfamiliar languages and APIs, and generates approaches to solving problems. It can be shockingly good at coding in narrow contexts. It would be a mistake to miss the developments happening in this area.
I think people are misunderstanding my comment here.
I said “GPT-3 is not exactly a good tool for programming”, but that actually meant “GPT-3 is not exactly a good tool to program in”. OP implemented a string-reversing algorithm in GPT-3, and my comment was made in the exact same context. In other words, I was treating GPT-3 as a kind of programming language.
Well, a program is a series of tokens, and what is GPT-3 good at? Generating tokens. While that's oversimplifying, I feel like we're closer to automated programming than we realize.
Recently I wrote a Python script to merge a bunch of videos with subtitle files using ffmpeg. It probably would have been faster to do it manually, but I can imagine a world where telling GPT-5 to "Generate a python script that merges a folder of video files with subtitle files of the same name" is faster and more accessible than regular programming.
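Roughly something like this minimal sketch (not the exact script I wrote; it assumes .mp4 videos, .srt subtitles with matching filenames, and ffmpeg on the PATH):

    # Rough sketch: mux each .mp4 in a folder with a same-named .srt
    # subtitle file using ffmpeg (assumes ffmpeg is on the PATH).
    import subprocess
    from pathlib import Path

    def merge_folder(folder: str) -> None:
        src = Path(folder)
        out = src / "merged"
        out.mkdir(exist_ok=True)
        for video in src.glob("*.mp4"):
            subs = video.with_suffix(".srt")
            if not subs.exists():
                continue  # no matching subtitle file, skip this video
            subprocess.run(
                ["ffmpeg", "-i", str(video), "-i", str(subs),
                 "-c", "copy", "-c:s", "mov_text",  # copy A/V, embed subs
                 str(out / video.name)],
                check=True,
            )

    if __name__ == "__main__":
        merge_folder(".")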
Yep. It's not hard to imagine mapping some description text to strictly deterministic operations like generating CSS/HTML for a front end, doing some definite data manipulation, or, at the least, turning a natural-language description into an SQL query.
Generating the tokens isn't the hard part. It's figuring out which tokens need to be generated in response to whatever solution needs to be coded. That's part communication, part computer science, and part art.
Given how hard it is for humans to communicate effectively, I'm not sure we are so close. In essence, the hard part of most software is giving users something they want that is also correct.
Indeed. I tried many prompts with DALL-E Mini, and the generated art is located at insta/pramatias alongside the prompts. Actually, I didn't know that Instagram prohibits downloading the images, so they will be uploaded to additional sites as well. Is there any site with Instagram's beauty and simplicity for uploading albums? DeviantArt is pretty bloated.
Actually, after thousands of prompts to DALL-E Mini, I found that the more you treat the prompt as a programming language rather than as natural language, the better and more accurate the results are. In that regard, operator-first is better, almost like Lisp. I tried prompts with parentheses, but the nesting didn't affect the results.
I think that, with the modern bombardment of information, everyone needs to be an information analyst and programmer, an information analyst and engineer, an information analyst and doctor. DALL-E will help us construct images that follow mnemonic rules which can be represented as art. That way we can memorize many corners of the information we want to remember and know how not to lose the plot of the project in question. Like an image for every function, or an image for every module, or for every enum and trait.
ColorForth did exist in the past; most probably we can make an "artforth" with the speed and ease of modern tools.
It used to be a great skill when Google's behavior was reasonably static and predictable, and therefore learnable. Today, if you open two Google instances on your phone and computer, they'll both likely return different results. Move to the next city block and, again, same problem. You want to google the same query again? If the algorithm thinks you didn't find what you were looking for the first time, you'll once again get different results.
In this way, I think these language transformers will be much better for searching for information. Not because of their great comprehension abilities or indexing prowess, but because their behavior will be static and the training data reasonably good. Soon enough someone will find better ways to display their learned associations, and they'll become great search engines (if you can index the content relevant to you, that is).
100% agreed. I already see myself doing this with GitHub Copilot. If I write a comment or start a line of code in a certain way, I get a much better suggested code completion.
I feel like this is a given in a lot of sci-fi I read. "Jokester," an Asimov short story, is premised on people called "Grand Masters" who know how to ask the right questions of Multivac, the globe-spanning supercomputer that appears in a few of his stories.
Part of the problem here is that GPT-3 has such a small vocabulary. It's 50K tokens, and many of those are either garbage, punctuation, or full words (rather than subwords).
I'd be curious to see what scaling up the size of the vocabulary would do to improve these results in a model like GPT-3...
50k is not the number of unique words that GPT-3 supports; you're probably thinking of the BPE tokens. The input to GPT-3 is not tokenized by splitting on spaces; it's based on byte-pair encoding (BPE) tokens. You can play with it here: https://beta.openai.com/tokenizer.
A rare word like "blithe" is tokenized into two BPE tokens, "bl" and "ithe", whereas a common word like "the" gets its own token.
I don't think a larger vocab would help. All the individual letters are already in the ~50k-token vocab, but the word "alphabet" will still not get tokenized to [a, l, p, h, a, b, e, t]. Using a larger vocab, like PaLM's 256k one, would have the same issue.
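A rough way to see this behavior yourself is with the GPT-2 BPE tokenizer from the Hugging Face transformers library; GPT-3 uses a similar ~50k BPE vocabulary, so the exact splits may differ slightly from the OpenAI tokenizer page:

    # Illustrative only: GPT-2's BPE vocabulary is the ~50k-token one being
    # discussed; exact GPT-3 splits are shown on the OpenAI tokenizer page.
    from transformers import GPT2TokenizerFast

    tok = GPT2TokenizerFast.from_pretrained("gpt2")

    print(tok.tokenize("the"))       # common word: a single token
    print(tok.tokenize("blithe"))    # rare word: split into sub-word pieces
    print(tok.tokenize("alphabet"))  # not split into single letters, even
                                     # though each letter is in the vocab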
> GPT-3 correctly reverses long words! But to get there, we had to teach GPT-3 the algorithm to use to get around its limitations.
Has GPT-3 really been "taught" anything here? If you don't provide an explicit example as the context of your input, GPT-3 does not retain the ability to reverse words.
(author here) It depends a bit on how you define "retain". Most GPT-3 applications use custom "prompts" to train it for their specific use case. So in that way, the prompt is retained with every request.
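To make that concrete, here is a hedged sketch of what "the prompt is retained with every request" looks like in practice; the reversal examples and names below are illustrative, not the article's actual prompt:

    # Illustrative sketch: a fixed few-shot prefix is prepended to every
    # user input before it is sent to the model, so the "teaching" travels
    # with each request. Prompt text and names here are made up.
    FEW_SHOT_PREFIX = (
        "Reverse each word letter by letter.\n\n"
        "Word: cat\n"
        "Letters: c, a, t\n"
        "Reversed letters: t, a, c\n"
        "Reversed word: tac\n\n"
        "Word: {word}\n"
        "Letters:"
    )

    def build_prompt(word: str) -> str:
        # the same retained context is sent along with every new word
        return FEW_SHOT_PREFIX.format(word=word)

    print(build_prompt("hello"))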
You can also fine-tune GPT-3 to retain the ability to reason through problems. For example, check out this work on reasoning for grade school math: https://openai.com/blog/grade-school-math/
It has performed a novel (to it) task based on instructions, and this is IMHO remarkable. It should be possible to make it retain and recall this procedure.
Everything non-sci-fi AI does is “just” an algorithm, so it won’t live up to standards of human abilities, precisely because we know how this result has been obtained.
No, it isn't taught anything. GPT-3 text generation is effectively a really fancy autocompletion algorithm based on the n previous tokens in a rolling window. You can only "teach" GPT-3 something within that window, and it doesn't "learn" there; it just tries its best to generate content based on what is stored in its massive n-dimensional table of graph edges for tokens.
That is also why it has such a strong propensity to lose the plot once you are outside of that window size and it's generating new content based on self-generated content.
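A toy sketch of that "rolling window" view (model and sample are hypothetical stand-ins for the trained network and its token sampler, and the window size is just an example):

    # Toy sketch of autocompletion with a fixed context window; `model` and
    # `sample` are hypothetical stand-ins. Only the last `window` tokens
    # condition each generation step.
    def generate(model, sample, prompt_tokens, window=2048, max_new=100):
        tokens = list(prompt_tokens)
        for _ in range(max_new):
            context = tokens[-window:]           # anything older falls out
            next_token = sample(model(context))  # pick from next-token dist
            tokens.append(next_token)
        return tokens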
I'm not sure how you define teaching, but for me, being shown an example and then repeating it successfully with another input does count as teaching/learning. I know the model doesn't update, though; let's not focus on that now.
If anthropomorphizing bothers you, then we could just use "prompting", but I feel teaching is a good enough approximation here.
It's repeating based on what the trained model has absorbed about situations where instructions possibly similar to the ones given were specified, and which were about reversing strings in general.
If the author messed with temperature and retried their failing prompt enough times, or simply reworded it a little differently, they might also get the correct answer.
You're right for GPT-3, but this is an example of chain-of-thought reasoning, which seems to be a new area of research [1] and might get integrated into newer versions.
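For flavor, a chain-of-thought prompt in the spirit of [1] looks roughly like this: the worked example spells out intermediate reasoning steps before the final answer, and the model is expected to imitate that structure for the new question (the tennis-ball example is paraphrased from the paper; the second question is made up):

    # Chain-of-thought style prompt in the spirit of [1].
    COT_PROMPT = (
        "Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls. "
        "Each can has 3 tennis balls. How many tennis balls does he have now?\n"
        "A: Roger started with 5 balls. 2 cans of 3 tennis balls each is 6 "
        "tennis balls. 5 + 6 = 11. The answer is 11.\n\n"
        "Q: {question}\n"
        "A:"
    )

    print(COT_PROMPT.format(
        question="A baker has 12 eggs and uses 5. How many are left?"))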
That's easy to solve. Prepare all K-12 textbooks as prompts, train another GPT-N to go from input to those prompts, then feed these prompts to the current GPT-3.
I was just thinking the opposite: that by choosing such a tiny problem, one might be able to actually develop some intuition about what's going on inside that very black box.
I meant it mostly as a joke, but there is a certain amount of irony to it. This goes way beyond prompt engineering - he wrote an algorithm to run on GPT in a way you would not expect a non-programmer to write. I think the idea is cool and the process to write it was revealing.
We are just the result of electrical signals (and a few chemical ones) in the brain, right? ;)
What GPT-3 doesn't seem to have yet is large-scale temporal coherence and a stable motivational and qualitative structure that gives value to sentient lives. I do think it's possible there are some traces of sentience in those large models, and we should be aware of that to prevent unnecessary suffering and a poor quality of existence.
Sentience comes from being embodied. We're not just our brains. The nervous system is intertwined with the rest of the body. There are some thirty million neurons in your gut, and bacteria there can influence your mood. We don't learn about the world primarily from a bunch of tokens. We do so by interacting with our bodies. Language is a kind of additional ability we've developed.
Our brains are how many orders of magnitude more complex than GPT-3? Honest question.
(I'd guess that the answer is "N/A" because we can't even approximate the complexity of the base algorithms operating in the biological brain, just the number of connections. Or maybe we can?)
I think that sometime in the near future, knowing how to phrase something to GPT, DALL-E, etc. will be a very valuable skill for humans to have.
Build a character array in Python for the string "hellohackernews":
['h', 'e', 'l', 'l', 'o', 'h', 'a', 'c', 'k', 'e', 'r', 'n', 'e', 'w', 's']
Reverse the order of the python array characters:
['s', 'w', 'e', 'n', 'r', 'a', 'k', 'c', 'a', 'h', 'o', 'l', 'l', 'e', 'h']
Er, maybe not...
Reassemble the second array into a string without spaces:
"swenrakcaholles"
[1] https://arxiv.org/abs/2201.11903
Can we get a GPT-N-3 this way to do SAT?
- Joscha Bach 16 May 2022
https://twitter.com/Plinz/status/1526268745802346496
GPT-3 is just the world's largest char-RNN, right?