... and this is exactly what programming is: breaking down a task into steps that a computer can comprehend. I now have an even stronger feeling that everyone should end up being a programmer. Plus, GPT-3 is not exactly a good tool for programming.
I use GPT-3 Codex daily when working. It saves me time, helps me explore unfamiliar languages and APIs, and generates approaches to solving problems. It can be shockingly good at coding in narrow contexts. It would be a mistake to miss the developments happening in this area.
I think people are misunderstanding my comment here.
I said “GPT-3 is not exactly a good tool for programming”, but that actually meant “GPT-3 is not exactly a good tool to program in”. OP implemented a string-reversing algorithm in GPT-3, and my comment was made in the exact same context. In other words, I was treating GPT-3 as a kind of programming language.
Well, a program is a series of tokens, and what is GPT-3 good at? Generating tokens. While that's oversimplifying, I feel like we're closer to automated programming than we realize.
Recently I wrote a Python script to merge a bunch of videos with subtitle files using ffmpeg. It probably would have been faster to do it manually, but I can imagine a world where telling GPT-5 to "Generate a python script that merges a folder of video files with subtitle files of the same name" is faster and more accessible than regular programming.
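Roughly something like this minimal sketch (not the exact script I wrote; it assumes .mp4 videos, .srt subtitles with matching filenames, and ffmpeg on the PATH):

    # Rough sketch: mux each .mp4 in a folder with a same-named .srt
    # subtitle file using ffmpeg (assumes ffmpeg is on the PATH).
    import subprocess
    from pathlib import Path

    def merge_folder(folder: str) -> None:
        src = Path(folder)
        out = src / "merged"
        out.mkdir(exist_ok=True)
        for video in src.glob("*.mp4"):
            subs = video.with_suffix(".srt")
            if not subs.exists():
                continue  # no matching subtitle file, skip this video
            subprocess.run(
                ["ffmpeg", "-i", str(video), "-i", str(subs),
                 "-c", "copy", "-c:s", "mov_text",  # copy A/V, embed subs
                 str(out / video.name)],
                check=True,
            )

    if __name__ == "__main__":
        merge_folder(".")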
Yep. It's not hard to imagine mapping some description text to strictly deterministic operations like generating CSS/HTML for a front end, doing some definite data manipulation, or, at the least, turning a natural-language description into an SQL query.
Generating the tokens isn't the hard part. It's figuring out which tokens need to be generated in response to whatever solution needs to be coded. That's part communication, part computer science, and part art.
Given how hard it is for humans to communicate effectively, I'm not sure we are so close. In essence, the hard part of most software is giving users something they want that is also correct.
Indeed. I tried many prompts with DALL-E Mini, and the generated art is located at insta/pramatias alongside the prompts. Actually, I didn't know that Instagram prohibits downloading the images, so they will be uploaded to additional sites as well. Is there any site with Instagram's beauty and simplicity for uploading albums? DeviantArt is pretty bloated.
Actually, after thousands of prompts to DALL-E Mini, I found that the more you treat the prompt as a programming language rather than as natural language, the better and more accurate the results are. In that regard, operator-first is better, almost like Lisp. I tried prompts with parentheses, but the nesting didn't affect the results.
I think that, with the modern bombardment of information, everyone needs to be an information analyst and programmer, an information analyst and engineer, an information analyst and doctor. DALL-E will help us construct images that follow mnemonic rules which can be represented as art. That way we can memorize many corners of the information we want to remember and know how not to lose the plot of the project in question. Like an image for every function, or an image for every module, or for every enum and trait.
ColorForth did exist in the past; most probably we can make an "artforth" with the speed and ease of modern tools.
It used to be a great skill when Google's behavior was reasonably static and predictable, and therefore learnable. Today, if you open two Google instances on your phone and computer, they'll both likely return different results. Move to the next city block and, again, same problem. You want to google the same query again? If the algorithm thinks you didn't find what you were looking for the first time, you'll once again get different results.
In this way, I think these language transformers will be much better for searching for information. Not because of their great comprehension abilities or indexing prowess, but because their behavior will be static and the training data reasonably good. Soon enough someone will find better ways to display their learned associations, and they'll become great search engines (if you can index the content relevant to you, that is).
100% agreed. I already see myself doing this with GitHub Copilot. If I write a comment or start a line of code in a certain way, I get a much better suggested code completion.
I feel like this is a given in a lot of sci-fi I read. "Jokester," an Asimov short story, is premised on people called "Grand Masters" who know how to ask the right questions of Multivac, the globe-spanning supercomputer that appears in a few of his stories.
Part of the problem here is that GPT-3 has such a small vocabulary. It's 50K tokens, and many of those are either garbage, punctuation, or full words (rather than subwords).
I'd be curious to see what scaling up the size of the vocabulary would do to improve these results in a model like GPT-3...
50k is not the number of unique words that GPT-3 supports; you're probably thinking of the BPE tokens. The input to GPT-3 is not tokenized by splitting on spaces; it's based on byte-pair encoding (BPE) tokens. You can play with it here: https://beta.openai.com/tokenizer.
A rare word like "blithe" is tokenized into two BPE tokens, "bl" and "ithe", whereas a common word like "the" gets its own token.
I don't think a larger vocab would help. All the individual letters are already in the ~50k-token vocab, but the word "alphabet" will still not get tokenized to [a, l, p, h, a, b, e, t]. Using a larger vocab, like PaLM's 256k one, would have the same issue.
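A rough way to see this behavior yourself is with the GPT-2 BPE tokenizer from the Hugging Face transformers library; GPT-3 uses a similar ~50k BPE vocabulary, so the exact splits may differ slightly from the OpenAI tokenizer page:

    # Illustrative only: GPT-2's BPE vocabulary is the ~50k-token one being
    # discussed; exact GPT-3 splits are shown on the OpenAI tokenizer page.
    from transformers import GPT2TokenizerFast

    tok = GPT2TokenizerFast.from_pretrained("gpt2")

    print(tok.tokenize("the"))       # common word: a single token
    print(tok.tokenize("blithe"))    # rare word: split into sub-word pieces
    print(tok.tokenize("alphabet"))  # not split into single letters, even
                                     # though each letter is in the vocab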
> GPT-3 correctly reverses long words! But to get there, we had to teach GPT-3 the algorithm to use to get around its limitations.
Has GPT-3 really been "taught" anything here? If you don't provide an explicit example as the context of your input, GPT-3 does not retain the ability to reverse words.
(author here) It depends a bit on how you define "retain". Most GPT-3 applications use custom "prompts" to train it for their specific use case. So in that way, the prompt is retained with every request.
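To make that concrete, here is a hedged sketch of what "the prompt is retained with every request" looks like in practice; the reversal examples and names below are illustrative, not the article's actual prompt:

    # Illustrative sketch: a fixed few-shot prefix is prepended to every
    # user input before it is sent to the model, so the "teaching" travels
    # with each request. Prompt text and names here are made up.
    FEW_SHOT_PREFIX = (
        "Reverse each word letter by letter.\n\n"
        "Word: cat\n"
        "Letters: c, a, t\n"
        "Reversed letters: t, a, c\n"
        "Reversed word: tac\n\n"
        "Word: {word}\n"
        "Letters:"
    )

    def build_prompt(word: str) -> str:
        # the same retained context is sent along with every new word
        return FEW_SHOT_PREFIX.format(word=word)

    print(build_prompt("hello"))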
You can also fine-tune GPT-3 to retain the ability to reason through problems. For example, check out this work on reasoning for grade school math: https://openai.com/blog/grade-school-math/
It has performed a novel (to it) task based on instructions, and this is IMHO remarkable. It should be possible to make it retain and recall this procedure.
Everything non-sci-fi AI does is “just” an algorithm, so it won’t live up to standards of human abilities, precisely because we know how this result has been obtained.
No, it isn't taught anything. GPT-3 text generation is effectively a really fancy autocompletion algorithm based on the n previous tokens in a rolling window. You can only "teach" GPT-3 something within that window, and it doesn't "learn" there; it just tries its best to generate content based on what is stored in its massive n-dimensional table of graph edges for tokens.
That is also why it has such a strong propensity to lose the plot once you are outside of that window size and it's generating new content based on self-generated content.
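A toy sketch of that "rolling window" view (model and sample are hypothetical stand-ins for the trained network and its token sampler, and the window size is just an example):

    # Toy sketch of autocompletion with a fixed context window; `model` and
    # `sample` are hypothetical stand-ins. Only the last `window` tokens
    # condition each generation step.
    def generate(model, sample, prompt_tokens, window=2048, max_new=100):
        tokens = list(prompt_tokens)
        for _ in range(max_new):
            context = tokens[-window:]           # anything older falls out
            next_token = sample(model(context))  # pick from next-token dist
            tokens.append(next_token)
        return tokens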
I'm not sure how you define teaching, but for me, being shown an example and then repeating it successfully with another input does count as teaching/learning. I know the model doesn't update, though; let's not focus on that now.
If anthropomorphizing bothers you, then we could just use "prompting", but I feel teaching is a good enough approximation here.
It's repeating based on what the trained model has absorbed about situations where instructions possibly similar to the ones given were specified, and which were about reversing strings in general.
If the author messed with temperature and retried their failing prompt enough times, or simply reworded it a little differently, they might also get the correct answer.
You're right for GPT-3, but this is an example of chain-of-thought reasoning, which seems to be a new area of research [1] and might get integrated into newer versions.
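For flavor, a chain-of-thought prompt in the spirit of [1] looks roughly like this: the worked example spells out intermediate reasoning steps before the final answer, and the model is expected to imitate that structure for the new question (the tennis-ball example is paraphrased from the paper; the second question is made up):

    # Chain-of-thought style prompt in the spirit of [1].
    COT_PROMPT = (
        "Q: Roger has 5 tennis balls. He buys 2 more cans of tennis balls. "
        "Each can has 3 tennis balls. How many tennis balls does he have now?\n"
        "A: Roger started with 5 balls. 2 cans of 3 tennis balls each is 6 "
        "tennis balls. 5 + 6 = 11. The answer is 11.\n\n"
        "Q: {question}\n"
        "A:"
    )

    print(COT_PROMPT.format(
        question="A baker has 12 eggs and uses 5. How many are left?"))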
That's easy to solve. Prepare all K-12 textbooks as prompts, train another GPT-N to go from input to those prompts, then feed these prompts to the current GPT-3.
I was just thinking the opposite: that by choosing such a tiny problem, one might be able to actually develop some intuition about what's going on inside that very black box.
I meant it mostly as a joke, but there is a certain amount of irony to it. This goes way beyond prompt engineering - he wrote an algorithm to run on GPT in a way you would not expect a non-programmer to write. I think the idea is cool and the process to write it was revealing.
We are just the result of electrical signals (and a few chemical ones) in the brain, right? ;)
What GPT-3 doesn't seem to have yet is large-scale temporal coherence and a stable motivational and qualitative structure that gives value to sentient lives. I do think it's possible there are some traces of sentience in those large models, and we should be aware of that to prevent unnecessary suffering and a poor quality of existence.
Sentience comes from being embodied. We're not just our brains. The nervous system is intertwined with the rest of the body. There are some thirty million neurons in your gut, and bacteria there can influence your mood. We don't learn about the world primarily from a bunch of tokens. We do so by interacting with our bodies. Language is a kind of additional ability we've developed.
Our brains are how many orders of magnitude more complex than GPT-3? Honest question.
(I'd guess that the answer is "N/A" because we can't even approximate the complexity of the base algorithms operating in the biological brain, just the number of connections. Or maybe we can?)
I think that sometime in the near future, knowing how to phrase something to GPT, DALL-E, etc. will be a very valuable skill for humans to have.
Build a character array in Python for the string "hellohackernews":
['h', 'e', 'l', 'l', 'o', 'h', 'a', 'c', 'k', 'e', 'r', 'n', 'e', 'w', 's']
Reverse the order of the python array characters:
['s', 'w', 'e', 'n', 'r', 'a', 'k', 'c', 'a', 'h', 'o', 'l', 'l', 'e', 'h']
Er, maybe not...
Reassemble the second array into a string without spaces:
"swenrakcaholles"
[1] https://arxiv.org/abs/2201.11903
Can we get a GPT-N-3 this way to do SAT?
- Joscha Bach 16 May 2022
https://twitter.com/Plinz/status/1526268745802346496
GPT-3 is just the world's largest char-RNN, right?